Epididymal lipocalin gene and uses thereof

ABSTRACT

Isolated nucleic acids comprising a lipocalin gene promoter region, isolated nucleic acids comprising a human lipocalin gene, isolated nucleic acids encoding a lipocalin polypeptide, isolated lipocalin polypeptides, and uses thereof. The disclosed lipocalin nucleic acids and polypeptides can be used to generate a mouse model of male infertility, for drug discovery screens, and for therapeutic treatment of fertility-related conditions.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is based on and claims priority to U.S. Provisional Patent Application Serial No. 60/258,655 filed Dec. 29, 2000, the entire contents of which are herein incorporated by reference.

GRANT STATEMENT

[0002] This work was supported by NICHD grant HD36900. Thus, the U.S. Government has rights in the invention.

FIELD OF THE INVENTION

[0003] The present invention generally relates to epididymal function and male fertility. More particularly, the present invention provides lipocalin nucleic acid and polypeptide sequences, a lipocalin gene promoter region that directs gene expression in the epididymis, chimeric genes comprising disclosed lipocalin sequences, and uses thereof.

Table of Abbreviations

CAT       chloramphenicol acetyl transferase
ES        embryonic stem cell
FCS       fluorescence correlation spectroscopy
hEP17     human Epididymal Protein 17
mE-RABP   mouse Epididymal Retinoic Acid Binding Protein
mEP17     mouse Epididymal Protein 17
MS        mass spectroscopy
PCR       polymerase chain reaction
PGK       phosphoglycerate kinase
pLN-17    mEP17 targeting vector for homologous recombination
RAR       Retinoic Acid Receptor
RT-PCR    reverse transcription polymerase chain reaction
SELDI     surface-enhanced laser desorption/ionization
SPR       surface plasmon resonance
TOF       time of flight mass spectroscopy

BACKGROUND ART

[0004] Recent studies of reproductive frequency in the United States report that 7% of married couples (greater than 2 million couples) describe difficulty in achieving a pregnancy. See Fidler and Bernstein (1999) Public Health Reports 114:494-511. Many individuals now seek medical support for conception, including infertility diagnosis and assisted reproductive treatment. The monetary costs for such services are substantial, and financial commitment must increase to include pre- and post-natal care of multiple birth pregnancies often associated with infertility treatment. A tumult of legal and ethical issues have emerged regarding the rights of parents and unborn children that are conceived by an unconventional method. The escalating magnitude of monetary, legal, and ethical concerns when considering pregnancy has established infertility as a significant public health issue.

[0005] Rational treatment approaches for many andrological disorders resulting in infertility are still lacking. See Kamischke and Nieschlag (1999) Human Reproduction 14(Suppl. 1):1-23. The cause of male infertility is often unidentifiable, referred to generally as “idiopathic infertility”, or the presumed pathology is not yet met with an unequivocal therapy. Intracytoplasmic sperm injection has been a successful method for enabling fertilization in many cases, yet a less interventive treatment is still sought. The historical use of approaches now deemed ineffective emphasizes the importance of thorough studies during early stages of therapy development. In particular, there is a need for more sophisticated diagnostic tools that detect molecular bases of male infertility and for non-surgical therapies that are supported by solid physiological data. An initial effort in this regard is the development of animal models of male infertility.

[0006] A related medical incentive is the development of new methods for contraception. See Baird and Glasier (1999) BMJ 319:969-972. The prevalence of contraceptive use is increasing worldwide; however, existing contraceptive means are limited by adverse side effects, inconvenience, and remaining instances of ineffectiveness. In particular, there are presently no safe and reversible means for male contraception. One strategy that has been explored recently is an immunological approach for disrupting endocrine or physiological events that normally promote pregnancy. Vaccines that comprise antigens of sperm plasma membrane proteins, zona pellucida proteins of the egg, or gonadotropin releasing hormone have shown success in suppressing fertility when administered to several mammalian subjects, including humans. See U.S. Pat. Nos. 6,096,318 and 6,132,270; Barber and Fayrer-Hosken (2000) J Reprod Immunol 46:103-124; Paterson et al. (2000) Cells Tissues Organs 166:228-232; Srivastav (2000) J Reprod Fertil 119:241-252; Feng et al. (1999) J Reprod Med 44:759-765; Naz (1999) Immunol Rev 171:193-202; Talwar (1999) Immunol Rev 171:173-192.

[0007] Although animal use and clinical trials of immunocontraceptive vaccines are encouraging, existing vaccines present significant complications, for example, auto-immune reactions. Thus, current research is focused on identifying new antigens that can provide safer vaccines. Animal models with implications for post-testicular human male contraceptives acting at the epididymis offer promising leads. See Nikkanen et al. (2000) Contraception 61:401-406; Cooper and Yeung (1999) Hum Reprod Update 5:141-152. To establish molecular targets for vaccine and drug development, proteins that are essential for epididymal function have been identified (Srivastav, 2000; Diekman et al. (1999) Immunol Rev 171:203-211; Costa et al. (1997) Biol Reprod 56:985-990; Sonnenberg-Riethmacher et al. (1996) Genes Dev 10:1184-1193).

[0008] During their transit through the epididymis, spermatozoa undergo biochemical and morphological changes to acquire motility and the ability to fertilize an oocyte in vivo. The maturation process occurs progressively along the epididymal duct and is believed to depend on epididymal secretory proteins. The epididymal epithelial cells secrete proteins in a highly regulated and regionalized manner such that spermatozoa encounter luminal fluid protein in a specific sequence. Indeed, each region within the epididymis is a unique microenvironment adapted with a characteristic milieu of ions, organic solutes, proteins, and steroids. See Cornwall et al. (2001) in “The Epididymis”, Plenum Press. Spatially-restricted gene expression as well as regional differences in cellular morphology define three distinctive regions of the epididymis known as the caput, corpus, and cauda. These structural subdivisions and highly regionalized gene expression therein are observed in the epididymis of several organisms, including humans (Krull et al. (1993) Mol Reprod Dev 34:16-34).

[0009] Regionalization of the epididymis likely fulfills an essential and cumulative role in the maturation and survival of spermatozoa. In support thereof, targeted mutation of the mouse c-ros tyrosine kinase receptor confers male sterility, although sperm production is not affected. c-ros is normally expressed in the initial segment of the epididymis, and animals lacking c-ros function show specific underdevelopment and lack of cellular differentiation within the initial segment (Sonnenberg-Riethmacher et al., 1996). Sperm taken from a c-ros mutant mouse are less motile due to flagellar angulation, suggesting that failure of differentiation of epithelial cells in one segment of the epididymis can affect sperm maturation and survival (Yeung et al. (1999) Biol Reprod 61:1062-1069).

[0010] To generate regionalization within the epididymal epithelium, gene expression is precisely controlled, in part through transcriptional regulation. Transcription factors modulate transcription by binding DNA cis-regulatory sequences, most often located upstream of the gene promoter and transcription start site, and by concomitantly affecting assembly of cellular transcriptional machinery at the relevant promoter. Therefore, cis-regulatory sequences of epididymal-specific genes and the transcription factors that are operative through these sites are important elements in understanding epididymal function as it contributes to sperm maturation. Current approaches to identify mechanisms involved in region-specific gene expression in the epididymis have been limited by a lack of identified transcriptional regulatory proteins which are key to this process. See Cornwall et al. (2001).

[0011] Candidate regulators include components of retinoid signaling pathways. Most elements known to be involved in retinoid signaling are present in the epididymis, including epididymal retinoic acid binding protein (mE-RABP), cellular retinol-binding protein type I (CRBP I), cellular retinoic acid binding protein type I (CRABP I), retinoic acid receptor alpha (RARα), retinoic acid, and retinyl esters. Moreover, studies addressing the function of such elements emphasize the important role of retinoid signaling pathways in epididymal integrity. In retinoid deficient animals, there is widespread squamous metaplasia and keratinization of the epididymal epithelium (Wolbach (1925) J Exp Med 42:753-777), and abnormal synthesis and secretion of several epididymal proteins (Astraudo et al. (1995) Arch Androl 35:247-259). Similarly, overexpression of a dominant negative form of RARα leads to disorganization of the epididymal epithelium and concomitant infertility (Costa et al., 1997). In a related study, RARα knockout mice display aspermatogenesis and vacuolization of the epididymal epithelium (Lufkin et al. (1993) Proc Natl Acad Sci USA 90:7225-7229), and animals lacking both RARα and RARγ function show epididymal dysplasia (Mendelsohn et al. (1994) Development 120:2749-2771).

[0012] The mE-RABP protein is of particular interest among regulators of retinoid signaling, as it appears to be expressed selectively in the mid and distal caput of the epididymis. mE-RABP is a member of a family of secreted lipocalin proteins. Structural analyses reveal that lipocalins comprise an eight-stranded β barrel that is closed at one end by an α-helical turn, thereby forming a hydrophobic binding cavity. This hydrophobic pocket is well-adapted for noncovalent binding and transport of small lipophilic ligands. mE-RABP binds active retinoids (9-cis and all-trans retinoic acid), and functions as a retinoid carrier protein in the epididymis. See Ong et al. (2000) Biochim Biophys Acta 1482(1-2):209-17.

[0013] Recent studies by the co-inventors of the present application have identified a similar gene encoding a 17 kDa lipocalin, Mouse Epididymal Protein of 17 kDa (mEP17) (Lareyre et al. (2001) Endocrinology 142:1296-1306). mEP17 and mE-RABP are significantly related by several measures that collectively suggest mEP17 also functions as a regulator of retinoid signaling in the epididymis. First, mE-RABP and mEP17 are positioned adjacent to each other on mouse chromosome 2. Exon/intron boundaries are strictly conserved between mE-RABP and mEP17, supporting that these genes arose by gene duplication. Second, mEP17 shows regionalized expression in the epididymis. mEP17 expression is limited to the initial segment of the caput epididymis, while mE-RABP is expressed in the adjacent mid and distal caput epididymis. Third, the mEP17 protein contains two motifs (G-X-W and T-D-Y) and two cysteine residues that are characteristic features shared by members of the lipocalin protein family. With the exception of these motifs, mEP17 shows low sequence similarity with other known lipocalins. However, it is well established that lipocalin family members do not show significant sequence homology (average 25% identity and 50% homology between representative members). Rather, lipocalins are more clearly related by assessing homology of secondary and tertiary structure. The tryptophan residue of the G-X-W motif is required for binding of lipophilic ligands, and the two cysteine residues form an intramolecular disulfide bond that influences ligand affinity. In addition, a putative signal sequence at the amino terminus of the mEP17 precursor suggests that it is cleaved to generate a mature secreted protein, consistent with its identification as a lipocalin. These structural similarities between mEP17 and other lipocalins, most significantly mE-RABP, suggest that mEP17 is also a carrier for retinoid ligands.
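By way of non-limiting illustration only, the lipocalin features named above (the G-X-W and T-D-Y motifs and the cysteine residues) could be located in a candidate protein sequence with a short script such as the following sketch. The function name `find_lipocalin_features` and the toy sequence are hypothetical and are not taken from the disclosed sequences; a motif hit by itself does not establish that a protein is a lipocalin.

```python
import re

# Motifs named in the text; "X" stands for any residue.
# Lookahead patterns are used so that overlapping occurrences are all reported.
MOTIFS = {
    "G-X-W": re.compile(r"(?=G.W)"),
    "T-D-Y": re.compile(r"(?=TDY)"),
}


def find_lipocalin_features(protein: str) -> dict:
    """Return 1-based positions of the two motifs and of every cysteine."""
    seq = protein.upper()
    hits = {name: [m.start() + 1 for m in pattern.finditer(seq)]
            for name, pattern in MOTIFS.items()}
    hits["Cys"] = [i + 1 for i, residue in enumerate(seq) if residue == "C"]
    return hits


# Hypothetical toy sequence, for illustration only (not mEP17 or hEP17).
toy_sequence = "MKALGSWTCLLTDYAAKCQ"
features = find_lipocalin_features(toy_sequence)
print(features)
```

Positions are reported relative to the sequence supplied, so a precursor that still carries its signal sequence will give different numbering than the mature protein.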

[0014] The present invention relates to a current challenge in developing animal models of infertility, male fertility treatments, and male contraceptives. To this end, the present invention provides an isolated promoter region of the mEP17 gene, an isolated nucleic acid molecule encoding a human mEP17 gene (hEP17), an isolated promoter region of hEP17, and chimeric genes comprising the disclosed sequences. Host cells expressing a recombinant EP17 gene or an mEP17 promoter region operably linked to a reporter gene sequence are useful in screening assays for discovery of substances that modulate EP17. A chimeric gene comprising an mEP17 promoter region can also be used to direct transcription of a heterologous nucleotide sequence in the epididymis of a host organism. The present invention further provides an EP17 polypeptide that can be used for vaccine or drug development. By provision of epididymal lipocalin nucleotide and polypeptide sequences, and methods for using the same, the present invention meets a long-felt need for advancement in fertility research.

SUMMARY OF THE INVENTION

[0015] The present invention provides an isolated promoter region of an EP17 gene that reconstitutes endogenous expression in epididymis. In one preferred embodiment, a promoter region of the invention comprises a 5.3 kb fragment (GenBank Accession No. AF08222) of mouse genomic clone 10983 (Genome Systems, Inc.) between the EcoRV and SalI restriction sites, or functional portion thereof. More preferably, the functional portion of the promoter region comprises a TATA box and at least one cis-acting regulatory sequence selected from the group including but not limited to a Sp-1 binding site, an AP-1 binding site, a retinoic acid receptor binding site, an androgen receptor binding site, a C-Ets binding site, a SRY binding site, an APA binding site, a C/EBP binding site, and combinations thereof. Most preferably, an isolated promoter region of the present invention comprises the nucleotide sequence of SEQ ID NO:1, a nucleic acid molecule substantially identical to SEQ ID NO:1, or a 20 base pair nucleotide sequence identical to a contiguous 20 base pair nucleotide portion of SEQ ID NO:1.
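As a non-limiting sketch of the "contiguous 20 base pair" criterion recited above, the following script tests whether a candidate sequence contains any 20-nucleotide window that is identical to a 20-nucleotide window of a reference sequence. The function name `shares_contiguous_segment` and the short reference string are hypothetical placeholders, not SEQ ID NO:1, and only the strand as written is compared (the reverse complement is not checked).

```python
def shares_contiguous_segment(candidate: str, reference: str, k: int = 20) -> bool:
    """True if some k-nucleotide window of candidate is identical to one in reference."""
    cand, ref = candidate.upper(), reference.upper()
    if len(cand) < k or len(ref) < k:
        return False
    # Collect every k-mer of the reference once, then probe with candidate k-mers.
    reference_kmers = {ref[i:i + k] for i in range(len(ref) - k + 1)}
    return any(cand[i:i + k] in reference_kmers
               for i in range(len(cand) - k + 1))


# Hypothetical 39-nucleotide placeholder, for illustration only.
reference = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# A candidate that embeds 20 contiguous reference nucleotides between flanks.
matching = "TT" + reference[5:25] + "AA"
# A candidate that matches only 19 contiguous nucleotides.
non_matching = reference[:19] + "N"

print(shares_contiguous_segment(matching, reference))
print(shares_contiguous_segment(non_matching, reference))
```

Storing the reference windows in a set keeps each lookup cheap, which matters more for a multi-kilobase promoter region than for this toy input.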

[0016] The present invention also provides a human EP17 gene. Preferably, the human EP17 gene comprises the sequence set forth as SEQ ID NO:2, a nucleic acid molecule that is substantially similar to SEQ ID NO:2, or a nucleic acid molecule comprising a 20 base pair nucleotide sequence that is identical to a contiguous 20 base pair sequence of SEQ ID NO:2. The present invention further provides an isolated promoter region derived from a human EP17 gene. In this case, a hEP17 promoter region is preferably an about 5160 base pair region immediately upstream of the human EP17 transcription start site. More preferably, an isolated promoter region of the present invention comprises a TATA box and at least one cis-acting regulatory sequence selected from the group including but not limited to a Sp-1 binding site, an AP-1 binding site, a cAMP response element binding protein (CREB) binding site, a SRY-related HMG-box gene 5 (Sox5) binding site, a Sex-determining region Y gene product (SRY) binding site, a c-Ets binding site, a GATA binding site, an Octamer transcription factor 1 (Oct-1) binding site, and combinations thereof. Most preferably, an isolated promoter of the present invention comprises the nucleotide sequence of SEQ ID NO:5, a nucleic acid molecule substantially identical to SEQ ID NO:5, or a 20 base pair nucleotide sequence identical to a contiguous 20 base pair nucleotide portion of SEQ ID NO:5.

[0017] The present invention further provides a chimeric gene comprising an EP17 promoter region operably linked to a heterologous nucleotide sequence. Preferably, the EP17 promoter region comprises the nucleic acid molecule of SEQ ID NOs:1 or 5, or functional portion thereof. In a preferred embodiment, a chimeric gene of the invention is carried in a vector and expressed in a host cell including but not limited to a bacterial cell, a hamster cell, a mouse cell, or a human cell.

[0018] The present invention also provides a transgenic animal having a transgene that comprises a chimeric gene of the present invention. In a preferred embodiment, expression of the chimeric gene alters fertility of the host animal.

[0019] The present invention also provides a method for identifying a substance that regulates EP17 expression using a chimeric gene that includes an isolated EP17 promoter region operably linked to a reporter gene. According to this method, a gene expression system is established that includes the chimeric gene and components required for gene transcription and translation so that reporter gene expression is assayable. To select a substance that regulates EP17 expression, the method further provides the steps of using the gene expression system to determine a baseline level of reporter gene expression in the absence of a candidate regulator, providing a plurality of candidate regulators to the gene expression system, and assaying a level of reporter gene expression in the presence of a candidate regulator. A candidate regulator is selected whose presence results in an altered level of reporter gene expression when compared to the baseline level. Preferably, the isolated EP17 promoter region used in this method comprises the sequence of SEQ ID NOs:1 or 5, or functional portion thereof.
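The selection step of the screening method described above can be sketched, by way of non-limiting example, as a comparison of reporter readings against the baseline level. In this sketch the function name `select_regulators`, the compound labels, the readings, and the two-fold cutoff are all hypothetical assumptions; the disclosure specifies only that expression be "altered" relative to baseline and does not fix a threshold or statistical test.

```python
from statistics import mean


def select_regulators(baseline, treated, min_fold=2.0):
    """Return {candidate: fold change} for candidates that alter reporter expression.

    baseline: reporter readings taken without any candidate regulator.
    treated:  mapping of candidate name -> reporter readings with that candidate.
    min_fold: assumed cutoff; a candidate is kept if expression rises at least
              min_fold-fold or falls to 1/min_fold of baseline or below.
    """
    baseline_level = mean(baseline)
    if baseline_level <= 0:
        raise ValueError("baseline reporter level must be positive")
    selected = {}
    for name, readings in treated.items():
        fold = mean(readings) / baseline_level
        if fold >= min_fold or fold <= 1 / min_fold:
            selected[name] = fold
    return selected


# Hypothetical reporter readings (arbitrary units), for illustration only.
baseline_readings = [100, 110, 90]
candidate_readings = {
    "compound_A": [210, 190],   # activator-like
    "compound_B": [95, 105],    # no clear effect
    "compound_C": [40, 50],     # repressor-like
}

hits = select_regulators(baseline_readings, candidate_readings)
print(hits)
```

A real screen would typically add replicate statistics and normalization for cell number or transfection efficiency before applying any cutoff.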

[0020] In another aspect of the invention, a method is provided for producing an epididymal cell line using a chimeric gene comprising an EP17 promoter operably linked to a gene encoding a selectable marker. According to the method, a transgenic animal is generated that expresses a selectable marker gene. Preferably, the selectable marker gene is an antibiotic resistance gene. More preferably, the antibiotic resistance gene is a neomycin resistance gene. Epididymal cells are procured from the transgenic animal and stably reproduced in cell culture using selection of the marker gene. Preferably, the EP17 promoter region used to perform this method is the nucleic acid molecule of SEQ ID NO:1, or functional portion thereof.

[0021] Another aspect of the present invention pertains to a method for mutagenizing an EP17 locus by homologous recombination. The method uses a targeting vector having an isolated EP17 promoter region, a marker gene, and an isolated EP17 3′ flanking region. In a vector so constructed, the marker gene is positioned between the promoter region and the 3′ flanking region. In one embodiment, the targeting vector further comprises a mutant EP17 coding sequence, also positioned between the promoter region and the 3′ flanking region. The targeting vector is linearized by digestion with a restriction endonuclease at a site other than within the promoter region, marker gene, 3′ flanking region, and optional mutant EP17 coding sequence. The linearized vector is introduced into embryonic stem cells and is assayed by detecting the marker gene in the stem cells. Stem cells bearing the vector are used to create a transgenic vertebrate animal. According to the method, a homologous recombination event is mediated at the EP17 locus, thereby exchanging native mEP17 gene sequences positioned between the promoter region and the 3′ flanking region with the vector nucleotide sequences positioned between the same regions. In a more preferred embodiment, male EP17 mutant animals produced by the disclosed method are sterile.

[0022] The present invention also discloses a human EP17 polypeptide and an isolated nucleic acid sequence encoding the same. Preferably, an isolated EP17 polypeptide, or functional portion thereof, comprises a polypeptide encoded by the nucleic acid molecule of SEQ ID NO:3, a polypeptide encoded by a nucleic acid molecule that is substantially identical to SEQ ID NO:3, a polypeptide fragment encoded by a 20 nucleotide sequence that is identical to a contiguous 20 nucleotide sequence of SEQ ID NO:3; a polypeptide having an amino acid sequence of SEQ ID NO:4, a polypeptide that is a biological equivalent of the polypeptide of SEQ ID NO:4, or a polypeptide that is immunologically cross-reactive with an antibody that shows specific binding with a polypeptide comprising some or all amino acids of SEQ ID NO:4. Preferably, the polypeptide of the present invention comprises a human EP17 polypeptide.

[0023] The present invention further teaches chimeric genes having a heterologous promoter that drives expression of a nucleic acid sequence encoding an EP17 polypeptide. Preferably, the chimeric gene is carried in a vector and introduced into a host cell so that an EP17 polypeptide of the present invention is produced. Preferred host cells include but are not limited to a bacterial cell, a hamster cell, a mouse cell, or a human cell.

[0024] In another aspect of the invention, a method is provided for detecting a nucleic acid molecule that encodes an EP17 polypeptide. According to the method, a biological sample having nucleic acid material is hybridized under stringent hybridization conditions to an EP17 nucleic acid molecule of the present invention. Such hybridization enables a nucleic acid molecule of the biological sample and the EP17 nucleic acid molecule to form a detectable duplex structure. Preferably, the EP17 nucleic acid molecule includes some or all nucleotides of SEQ ID NOs:1, 2, 3, or 5. Also preferably, the biological sample comprises human nucleic acid material.

[0025] The present invention further teaches an antibody that specifically recognizes an EP17 polypeptide. Preferably, the antibody recognizes some or all amino acids of SEQ ID NO:4. A method for producing an EP17 antibody is also disclosed, and the method comprises recombinantly or synthetically producing an EP17 polypeptide, or portion thereof; formulating the EP17 polypeptide so that it is an effective immunogen; immunizing an animal with the formulated polypeptide to generate an immune response that includes production of EP17 antibodies; and collecting blood serum from the immunized animal containing antibodies that specifically recognize an EP17 polypeptide. Preferably, the EP17 polypeptide used as an immunogen includes some or all amino acid sequences of SEQ ID NO:4.

[0026] A method is also provided for detecting a level of EP17 polypeptide using an antibody that specifically recognizes an EP17 polypeptide. According to the method, a biological sample is obtained from an experimental subject and a control subject, and EP17 polypeptide is detected in the sample by immunochemical reaction with the EP17 antibody. Preferably, the antibody recognizes amino acids of SEQ ID NO:4 and is prepared according to a method of the present invention for producing such an antibody.

[0027] The present invention further discloses a method for identifying a compound that modulates EP17 function. The method comprises: exposing an isolated EP17 polypeptide to a plurality of compounds; and assaying binding of a compound to the isolated EP17 polypeptide. A compound is selected that demonstrates specific binding to the isolated EP17 polypeptide. Preferably, the EP17 polypeptide used in the binding assay of the method includes some or all amino acids of SEQ ID NO:4.

[0028] The present invention further provides a method for modulating EP17 function in a subject. According to the method, a pharmaceutical composition is prepared that includes a substance capable of modulating EP17 expression or function, and a carrier. An effective dose of the pharmaceutical composition is administered to a subject, whereby EP17 activity is altered in the subject. In a preferred embodiment, the substance used to perform this method shows specific binding to some or all amino acids of SEQ ID NO:4 and was discovered by a screening assay method of the present invention. In another embodiment, EP17 function is disrupted by immunizing a subject with an effective dose of the disclosed EP17 polypeptide. The immune system of the subject produces an antibody that specifically recognizes the EP17 polypeptide, and preferably recognizes some or all of the amino acids of SEQ ID NO:4. In a further embodiment, a gene therapy vector is used, the vector comprising a nucleotide sequence encoding an EP17 polypeptide. Alternatively, the gene therapy vector comprises a nucleotide sequence encoding a nucleic acid molecule, a peptide, or a protein that interacts with an EP17 nucleic acid or polypeptide. Preferably, the subject is a human subject.

[0029] A method is also provided for expressing a nucleotide sequence of interest in epididymis using an EP17 promoter region. According to the method, a gene therapy vector is prepared comprising an EP17 promoter region operably linked to a nucleotide sequence of interest. A gene therapy vector so constructed is administered to a subject, whereby the nucleotide sequence of interest is expressed in epididymis. Preferably, the EP17 promoter comprises SEQ ID NO:5, or functional portion thereof. Also preferably, the subject is a human subject.

[0030] The invention further provides a method for diminishing the fertile capacity of a subject. According to the method, a chemical compound, peptide, or antibody that interacts with an EP17 polypeptide is identified. Preferably, the polypeptide is the sequence of SEQ ID NO:4 or 6. A pharmaceutical preparation is prepared comprising such a chemical compound, peptide, or antibody, and a carrier. An effective dose of the pharmaceutical composition is administered to a subject, whereby the fertile capacity of the subject is diminished.

[0031] The invention further provides a method for promoting the fertile capacity of a subject. In this case, a chemical compound or peptide that interacts with an EP17 polypeptide is identified. Preferably, the polypeptide is the sequence of SEQ ID NO:4 or 6. A pharmaceutical composition comprising the chemical compound or peptide and a carrier is prepared. An effective dose of the pharmaceutical composition is administered to a subject, whereby the fertile capacity of the subject is improved.

[0032] Accordingly, it is an object of the present invention to provide novel EP17 nucleic acid and polypeptide sequences, and novel methods relating thereto. This object is achieved in whole or in part by the present invention.

[0033] An object of the invention having been stated above, other objects and advantages of the present invention will become apparent to those skilled in the art after a study of the following description of the invention, Figures and non-limiting Examples.

BRIEF DESCRIPTION OF THE DRAWINGS

[0034] FIG. 1 depicts genomic organization of the mEP17 gene. mEP17 is located upstream from mE-RABP within the locus [A3,B] of the mouse chromosome 2. Exon sizes are indicated in nucleotides. The major transcription initiation sites of both genes are represented with broken arrows. Primer “FwmEP17cDNA” (SEQ ID NO:7) was used for primer extension analysis. Two motifs G-X-W and T-D-Y and two cysteine residues (C) that contribute to the three dimensional structure of lipocalin proteins are also indicated.

[0035] FIG. 2A presents a Northern blot showing epididymis-specific expression of the mEP17 gene. Total RNA was extracted from individual tissues and hybridized with [³²P]-labeled mEP17 cDNA. Two major transcripts of 1 kb and 3.1 kb in size were detected only in the epididymis.

[0036] FIG. 2B shows Northern blot analysis of total RNA extracted from the epididymis, hybridized with [³²P]-labeled intron 1 of the mEP17 gene or with [³²P]-labeled mEP17 cDNA used as probes. The intron 1 probe only detected the 3.1 kb transcripts, suggesting that these transcripts are likely unspliced mEP17 precursor RNA.

[0037] FIG. 3 shows region-specific expression of the mEP17 gene in the initial segment of the epididymis. In situ hybridization of mEP17 transcripts is detected in the initial segment (IS) but not in the efferent duct (ED) and mid/distal caput epididymis (Cp).

[0038] FIGS. 4A and 4B show in situ hybridization of mEP17 in epididymal tissue, and also show cell-specific expression of the mEP17 gene.

[0039] FIG. 4A shows a high magnification view of the boxed region of FIG. 3 at the boundary between the initial segment (IS) and proximal caput epididymis (Cp). mEP17 mRNA is highly expressed only in the principal cells of the initial segment (IS). No staining is observed in the conjunctive tissue (CT) and in the epithelial cells of the proximal caput epididymis (Cp).

[0040] FIG. 4B shows hybridization of a section of the initial segment with a sense strand digoxygenin-labeled mEP17 RNA. No signal is detected.

[0041] FIG. 5 presents a comparison of the genomic structure of the mouse and human EP17 genes. The major transcription initiation sites (TIS) of both genes are indicated by broken arrows. The lipocalin-specific motifs (G-X-W, T-D-Y, and 2 cysteine residues) are also indicated. Black boxes indicate exons, and the line regions between the boxes indicate introns. Numbers below the boxes and line regions indicate exon and intron sizes in base pairs.

[0042] FIG. 6 presents BLAST results using human EP17 cDNA sequence of SEQ ID NO:3 as the query sequence. The highest homologies were observed with other epididymal lipocalins (E-RABP and prostaglandin H₂-D isomerases).

[0043] FIG. 7 presents a comparison of the amino acid sequences of mouse EP17 and human EP17 proteins. Conserved lipocalin motifs are indicated. The mouse and human EP17 proteins share 61% overall identity.

[0044] FIG. 8 shows hydropathic analysis of the murine and human EP17 proteins.

[0045] FIG. 9 depicts a restriction enzyme map of plasmids derived from BAC clone 10983 aligned with the mEP17/mE-RABP genomic region. The promoter region of mEP17, mEP17 exons, the intergenic region, and mE-RABP exons are indicated.

[0046] FIG. 10 shows primer extension analysis of the 5′ end of mEP17 mRNA. Total RNA extracted from the epididymis (Ep) or transfer (t) RNA was reverse transcribed with [³²P]-radiolabeled mEP17PE2 primer (SEQ ID NO:7) and extended using Avian Myeloblastosis virus (AMV) reverse transcriptase. Lanes labeled “C”, “T”, “A” and “G” are [³⁵S]-radiolabeled DNA sequencing reactions carried out using the mEP17PE2 primer (SEQ ID NO:7) and the pHindIII clone (indicated in FIG. 9) as template. The localization of two major (arrows) and two minor (arrowheads) transcription initiation sites are indicated.

[0047] FIG. 11 shows the nucleotide sequence of the mEP17 5.3 kb promoter region. Putative cis-DNA regulatory elements within the 5′ flanking region are underlined, including binding sites for androgen receptor (ARSB), retinoic acid receptor (RARE), Stimulating Protein 1 (SP-1), Activator Protein 1 (AP-1), Octamer transcription factor 1 (Oct-1), and Sox-5 (SRY-related Sequence #5 Protein). A consensus TATA box is indicated.

[0048] FIG. 12 depicts constructs used in functional assays of the mEP17 promoter, each construct comprising a different fragment of the 5.3 kb mEP17 promoter region (solid lines), an open reading frame encoding an exemplary reporter gene (chloramphenicol acetyltransferase, solid bar labeled “CAT”) operably linked to a promoter fragment, and the polyA tail region of Simian virus 40 large T antigen.

[0049] FIG. 13 shows the nucleotide sequence of the human EP17 promoter region. Putative cis-DNA regulatory elements are underlined, including a Stimulatory Protein 1 (Sp-1) binding site, an Activator Protein 1 (AP-1) binding site, a cAMP response element binding protein (CREB) site, a SRY-related HMG-box gene (Sox5) binding site, a Sex-determining region Y gene product (SRY) binding site, a c-Ets binding site, a GATA binding site, and an Octamer transcription factor 1 (Oct-1) binding site.

[0050] FIG. 14 presents a comparison of the putative cis-DNA regulatory elements in the mouse and human EP17 promoter regions.

[0051] FIGS. 15A, 15B, and 15C present experiments demonstrating hormonal regulation of mEP17 transcription.

[0052] FIG. 15A shows Northern blot analysis of epididymal total RNA (10 μg/lane) extracted from intact (I) and castrated animals at 5, 10, 20, and 30 days following castration (C5, C10, C20 and C30 respectively), hybridized with [³²P]-labeled mEP17 cDNA.

[0053] FIG. 15B shows Northern blot analysis to detect mEP17 RNA four days following hemicastration. Levels of mEP17 RNA in the epididymis of the castrated side (HI) are reduced to 0.7% of RNA levels in the epididymis of the non-castrated side (HC).

[0054] FIG. 15C shows Northern blot analysis to detect mEP17 cDNA 5 days after castration (C5) and 5 days after castration and androgen replacement (P).

[0055] FIG. 16 depicts homologous recombination at the mEP17 locus using the pLN-17 vector. The mEP17/mE-RABP genomic region is presented at the top. mEP17 exons are indicated by hatched rectangles. mE-RABP exons are indicated by open rectangles. The mEP17 targeting plasmid pLN-17 is designed so that 1.4 kb of mEP17 5′ flanking region is positioned immediately upstream of the vector PGK neomycin sequence, and 10.9 kb of mEP17 3′ flanking region and mE-RABP gene is positioned immediately downstream of the vector PGK neomycin sequence. mEP17 sequences carried in the pLN-17 targeting vector mediate homologous recombination, depicted as an “X” between the genomic region and the targeting plasmid. The recombination event creates a genomic reorganization wherein the entire mEP17 coding sequence is replaced by the PGK neomycin sequence.

[0056] FIG. 17 shows recombinant production of mEP17 protein using the pBAD/gIII vector (Invitrogen). Protein extracted from E. coli transformed with pBAD/gIII-mEP17 is resolved by polyacrylamide gel electrophoresis.

[0057] FIG. 17A shows Coomassie blue staining that identifies two enriched protein species (boxed).

[0058] FIG. 17B shows Western blot analysis using an anti-his tag antibody to detect two recombinant proteins of approximately 21 and 23 kDa, corresponding to the processed and non-processed mEP17 isoforms.

DETAILED DESCRIPTION OF THE INVENTION

[0059] The present invention provides isolated nucleic acids comprising a lipocalin gene promoter region (representative embodiments set forth as SEQ ID NOs:1 and 5), isolated nucleic acids comprising a human lipocalin gene (a representative embodiment set forth as SEQ ID NO:2), isolated nucleic acids encoding a lipocalin polypeptide (a representative embodiment set forth as SEQ ID NO:3), isolated lipocalin polypeptides (a representative embodiment set forth as SEQ ID NO:4), and uses thereof. The disclosed lipocalin nucleic acids and polypeptides can be used according to methods of the present invention to generate a mouse model of male infertility, for drug discovery screens, and for therapeutic treatment of fertility-related conditions, among other uses.

I. Definitions

[0060] While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the invention. The entire contents of all publications mentioned herein, including the discussion of the background art presented above, are hereby fully incorporated by reference.

I.A. EP17 Nucleic Acids

[0061] The nucleic acid molecules provided by the present invention include the isolated nucleic acid molecules of SEQ ID NOs:1, 2, 3, and 5, sequences substantially similar to sequences of SEQ ID NOs:1, 2, 3, and 5, conservative variants thereof, subsequences and elongated sequences thereof, complementary DNA molecules, and corresponding RNA molecules. The present invention also encompasses genes, cDNAs, chimeric genes, and vectors comprising disclosed EP17 nucleic acid sequences.

[0062] The term “nucleic acid molecule” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar properties as the reference natural nucleic acid. Unless otherwise indicated, a particular nucleotide sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), complementary sequences, subsequences, elongated sequences, as well as the sequence explicitly indicated. The terms “nucleic acid molecule” or “nucleotide sequence” can also be used in place of “gene”, “cDNA”, or “mRNA”. Nucleic acids can be derived from any source, including any organism.

[0063] The term “isolated”, as used in the context of a nucleic acid molecule, indicates that the nucleic acid molecule exists apart from its native environment and is not a product of nature. An isolated DNA molecule can exist in a purified form or can exist in a non-native environment such as a transgenic host cell.

[0064] The term “purified”, when applied to a nucleic acid, denotes that the nucleic acid is essentially free of other cellular components with which it is associated in the natural state. Preferably, a purified nucleic acid molecule is a homogeneous dry or aqueous solution. The term “purified” denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid is at least about 50% pure, more preferably at least about 85% pure, and most preferably at least about 99% pure.

[0065] The term “substantially identical”, in the context of two nucleotide or amino acid sequences, can also be defined as two or more sequences or subsequences that have at least 60%, preferably 80%, more preferably 90-95%, and most preferably at least 99% nucleotide or amino acid sequence identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms (described herein below under the heading Nucleotide and Amino Acid Sequence Comparisons) or by visual inspection. Preferably, the substantial identity exists in nucleotide sequences of at least 50 residues, more preferably in nucleotide sequences of at least about 100 residues, more preferably in nucleotide sequences of at least about 150 residues, and most preferably in nucleotide sequences comprising complete coding sequences. In one aspect, polymorphic sequences can be substantially identical sequences. The term “polymorphic” refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. An allelic difference can be as small as one base pair.
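The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity over all aligned columns; that denominator, the function names, and the tier labels are illustrative assumptions and are not taken from the sequence comparison algorithms referenced in this specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: gaps are written as '-' and every column that is not a
    gap in both sequences counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the recited thresholds (labels are illustrative)."""
    if pct >= 99:
        return "most preferred (at least 99%)"
    if pct >= 90:
        return "more preferred (90-95% range or above)"
    if pct >= 80:
        return "preferred (80%)"
    if pct >= 60:
        return "substantially identical (at least 60%)"
    return "below the 60% threshold"


if __name__ == "__main__":
    # Hypothetical 10-residue alignment with a single mismatch.
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, identity_tier(pct))
```

A real comparison would first align the sequences with one of the algorithms discussed under Nucleotide and Amino Acid Sequence Comparisons; this sketch covers only the final counting step.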

[0066] Another indication that two nucleotide sequences are substantially identical is that the two molecules specifically or substantially hybridize to each other under stringent conditions. In the context of nucleic acid hybridization, two nucleic acid sequences being compared can be designated a “probe” and a “target”. A “probe” is a reference nucleic acid molecule, and a “target” is a test nucleic acid molecule, often found within a heterogeneous population of nucleic acid molecules. A “target sequence” is synonymous with a “test sequence”.

[0067] A preferred nucleotide sequence employed for hybridization studies or assays includes probe sequences that are complementary to or mimic at least an about 14 to 40 nucleotide sequence of a nucleic acid molecule of the present invention. Preferably, probes comprise 14 to 20 nucleotides, or even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500 nucleotides or up to the full length of any of SEQ ID NOs:1, 2, 3, and 5. Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production. The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA). The phrase “binds substantially to” refers to complementary hybridization between a probe nucleic acid molecule and a target nucleic acid molecule and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired hybridization.

[0068] “Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern blot analysis are both sequence- and environment-dependent. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) “Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes” part I chapter 2, Elsevier, New York. Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Typically, under “stringent conditions” a probe will hybridize specifically to its target subsequence, but to no other sequences.

[0069] The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the T_(m) for a particular probe. An example of stringent hybridization conditions for Southern or Northern Blot analysis of complementary nucleic acids having more than about 100 complementary residues is overnight hybridization in 50% formamide with 1 mg of heparin at 42° C. An example of highly stringent wash conditions is 15 minutes in 0.15 M NaCl at 65° C. An example of stringent wash conditions is 15 minutes in 0.2×SSC buffer at 65° C. (See Sambrook (1989) for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of medium stringency wash conditions for a duplex of more than about 100 nucleotides is 15 minutes in 1×SSC at 45° C. An example of low stringency wash for a duplex of more than about 100 nucleotides is 15 minutes in 4-6×SSC at 40° C. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0-8.3, and the temperature is typically at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2-fold (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

[0070] The following are examples of hybridization and wash conditions that can be used to clone homologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the present invention: a probe nucleotide sequence preferably hybridizes to a target nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 2×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 1×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.5×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 65° C.
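The five graded wash conditions recited above share one hybridization buffer and differ only in SSC strength and temperature, which a small data structure makes explicit. The following is a minimal sketch; the class and function names are hypothetical, and the conversion of SSC strength to NaCl molarity assumes the conventional composition of 1×SSC (0.15 M NaCl, 0.015 M sodium citrate), which is not stated in this specification.

```python
from typing import NamedTuple


class Wash(NamedTuple):
    ssc: float      # SSC strength as a multiple of 1x
    sds_pct: float  # percent SDS in the wash
    temp_c: int     # wash temperature in degrees Celsius


# Common hybridization step for all five conditions, as recited in the text.
HYBRIDIZATION = "7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50 C"

# Washes listed in the order given in the text (least to most preferred).
WASHES = [
    Wash(2.0, 0.1, 50),
    Wash(1.0, 0.1, 50),
    Wash(0.5, 0.1, 50),
    Wash(0.1, 0.1, 50),
    Wash(0.1, 0.1, 65),
]


def is_more_stringent(a: Wash, b: Wash) -> bool:
    """True if wash a has no more salt and no lower temperature than b, and differs from b."""
    return a != b and a.ssc <= b.ssc and a.temp_c >= b.temp_c


def nacl_molarity(ssc: float) -> float:
    """NaCl molarity of an SSC dilution, assuming 1x SSC contains 0.15 M NaCl."""
    return 0.15 * ssc


if __name__ == "__main__":
    for wash in WASHES:
        print(f"{wash.ssc}x SSC, {wash.sds_pct}% SDS, {wash.temp_c} C "
              f"(about {nacl_molarity(wash.ssc):.3f} M NaCl)")
```

Read this way, each successive "more preferably" clause lowers the salt concentration or raises the temperature, so the list runs monotonically from the least to the most stringent wash.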

[0071] A further indication that two nucleic acid sequences are substantially identical is that proteins encoded by the nucleic acids are substantially identical, share an overall three-dimensional structure, are biologically functional equivalents, or are immunologically cross-reactive. These terms are defined further under the heading EP17 Polypeptides herein below. Nucleic acid molecules that do not hybridize to each other under stringent conditions are still substantially identical if the corresponding proteins are substantially identical. This can occur, for example, when two nucleotide sequences are significantly degenerate as permitted by the genetic code.
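The degeneracy point above can be shown with a short sketch: two coding sequences that differ at most codon positions can still encode the same polypeptide. The example uses the standard genetic code; the two 12-nucleotide sequences are hypothetical and are not drawn from any SEQ ID NO of this disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Percent identity between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


if __name__ == "__main__":
    # Hypothetical sequences that use different codons for Leu, Ser, and Arg.
    seq_a = "ATGCTGAGCCGT"
    seq_b = "ATGTTATCAAGA"
    print(translate(seq_a), translate(seq_b))
    print(f"{nucleotide_identity(seq_a, seq_b):.1f}% nucleotide identity")
```

Here the encoded peptides are identical even though the nucleotide identity falls below the 60% threshold, which is the situation in which hybridization under stringent conditions would not be expected.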

[0072] The term “conservatively substituted variants” refers to nucleic acid sequences having degenerate codon substitutions wherein the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acids Res 19:5081; Ohtsuka et al. (1985) J Biol Chem 260:2605-2608; Rossolini et al. (1994) Mol Cell Probes 8:91-98).

[0073] The term “subsequence” refers to a sequence of nucleic acids that comprises a part of a longer nucleic acid sequence. An exemplary subsequence is a probe, described herein above, or a primer. The term “primer” as used herein refers to a contiguous sequence comprising about 8 or more deoxyribonucleotides or ribonucleotides, preferably 10-20 nucleotides, and more preferably 20-30 nucleotides of a selected nucleic acid molecule. The primers of the invention encompass oligonucleotides of sufficient length and appropriate sequence so as to provide initiation of polymerization on a nucleic acid molecule of the present invention.

[0074] The term “elongated sequence” refers to an addition of nucleotides (or other analogous molecules) incorporated into the nucleic acid. For example, a polymerase (e.g., a DNA polymerase) can add sequences at the 3′ terminus of the nucleic acid molecule. In addition, the nucleotide sequence can be combined with other DNA sequences, such as promoters, promoter regions, enhancers, polyadenylation signals, intronic sequences, additional restriction enzyme sites, multiple cloning sites, and other coding segments.

[0075] The term “complementary sequence”, as used herein, indicates two nucleotide sequences that comprise antiparallel nucleotide sequences capable of pairing with one another upon formation of hydrogen bonds between base pairs. As used herein, the term “complementary sequences” means nucleotide sequences which are substantially complementary, as can be assessed by the same nucleotide comparison set forth above, or which are defined as being capable of hybridizing to the nucleic acid segment in question under relatively stringent conditions such as those described herein. A particular example of a complementary nucleic acid segment is an antisense oligonucleotide.
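The antiparallel pairing described above corresponds to the reverse complement of a sequence. The following minimal sketch handles only perfect Watson-Crick complementarity over the bases A, C, G, and T; the "substantially complementary" case, which tolerates mismatches, is not modeled, and the function names are illustrative.

```python
# Translation table for Watson-Crick pairing (A-T, C-G), preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b (5' to 3') pairs with strand_a at every position."""
    return strand_a.upper() == reverse_complement(strand_b).upper()


if __name__ == "__main__":
    # Hypothetical target and antisense oligonucleotide.
    target = "ATGGCCTTA"
    antisense = reverse_complement(target)
    print(antisense, is_perfect_complement(target, antisense))
```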

[0076] The term “gene” refers broadly to any segment of DNA associated with a biological function. A gene encompasses sequences including but not limited to a coding sequence, a promoter region, a cis-regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof. A gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information, and recombinant derivation of an existing sequence.

[0077] The term “promoter region” defines a nucleotide sequence within a gene that is positioned 5′ to a coding sequence of the same gene and functions to direct transcription of the coding sequence. The promoter region includes a transcriptional start site and at least one cis-regulatory element. The present invention encompasses nucleic acid sequences that comprise a promoter region of an EP17 gene, or functional portion thereof.

[0078] The terms “cis-acting regulatory sequence”, “cis-regulatory motif”, and “response element”, as used herein, each refer to a nucleotide sequence that enables responsiveness to a regulatory transcription factor. Responsiveness can encompass a decrease or an increase in transcriptional output and is mediated by binding of the transcription factor to the DNA molecule comprising the response element.

[0079] The term “transcription factor” generally refers to a protein that modulates gene expression by interaction with the cis-regulatory element and cellular components for transcription, including RNA Polymerase, Transcription Associated Factors (TAFs), chromatin-remodeling proteins, and any other relevant protein that impacts gene transcription.

[0080] The term “gene expression” generally refers to the cellular processes by which a biologically active polypeptide is produced from a DNA sequence.

[0081] A “functional portion” of a promoter gene fragment is a nucleotide sequence within a promoter region that is required for normal gene transcription. To determine nucleotide sequences that are functional, the expression of a reporter gene is assayed when variably placed under the direction of a promoter region fragment.

[0082] Promoter region fragments can be conveniently made by enzymatic digestion of a larger fragment using restriction endonucleases or DNAse I. Preferably, a functional promoter region fragment comprises about 5000 nucleotides, more preferably 2000 nucleotides, more preferably about 1000 nucleotides, more preferably about 500 nucleotides, even more preferably about 100 nucleotides, and even more preferably about 20 nucleotides.

[0083] The terms “reporter gene” or “marker gene” or “selectable marker” each refer to a heterologous gene encoding a product that is readily observed and/or quantitated. A reporter gene is heterologous in that it originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form. Non-limiting examples of detectable reporter genes that can be operably linked to a transcriptional regulatory region can be found in Alam and Cook (1990) Anal Biochem 188:245-254 and PCT International Publication No. WO 97/47763. Preferred reporter genes for transcriptional analyses include the lacZ gene (See, e.g., Rose and Botstein (1983) Meth Enzymol 101:167-180), Green Fluorescent Protein (GFP) (Cubitt et al. (1995) Trends Biochem Sci 20:448-455), luciferase, or chloramphenicol acetyl transferase (CAT). Preferred reporter genes for methods to produce transgenic animals include but are not limited to antibiotic resistance genes, and more preferably the antibiotic resistance gene confers neomycin resistance. Any suitable reporter and detection method can be used, and it will be appreciated by one of skill in the art that no particular choice is essential to or a limitation of the present invention.

[0084] An amount of reporter gene can be assayed by any method for qualitatively or, preferably, quantitatively determining presence or activity of the reporter gene product. The amount of reporter gene expression directed by each test promoter region fragment is compared to an amount of reporter gene expression directed by a control construct comprising the reporter gene in the absence of a promoter region fragment. A promoter region fragment is identified as having promoter activity when there is a significant increase in an amount of reporter gene expression in a test construct as compared to a control construct. The term “significant increase”, as used herein, refers to a quantified change in a measurable quality that is larger than the margin of error inherent in the measurement technique, preferably an increase by about 2-fold or greater relative to a control measurement, more preferably an increase by about 5-fold or greater, and most preferably an increase by about 10-fold or greater.
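The comparison of a test construct against a promoterless control can be sketched as a fold-increase calculation. This is a minimal illustration only: the function names are hypothetical, the "about" 2-, 5-, and 10-fold levels are treated as exact cutoffs, and the separate requirement that the change exceed the margin of measurement error is not modeled.

```python
def fold_increase(test_signal: float, control_signal: float) -> float:
    """Reporter signal of the test construct relative to the control construct."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return test_signal / control_signal


def promoter_activity_tier(fold: float) -> str:
    """Classify a fold increase against the recited 2-, 5-, and 10-fold levels."""
    if fold >= 10:
        return "most preferred (about 10-fold or greater)"
    if fold >= 5:
        return "more preferred (about 5-fold or greater)"
    if fold >= 2:
        return "preferred (about 2-fold or greater)"
    return "below the preferred 2-fold level"


if __name__ == "__main__":
    # Hypothetical reporter readings in arbitrary units (e.g., CAT activity).
    control = 100.0
    for name, signal in [("fragment A", 1200.0), ("fragment B", 150.0)]:
        fold = fold_increase(signal, control)
        print(f"{name}: {fold:.1f}-fold, {promoter_activity_tier(fold)}")
```

In practice, replicate measurements would be needed to establish the margin of error before any fold increase is treated as significant.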

[0085] The present invention also encompasses chimeric genes comprising the disclosed EP17 sequences. The term “chimeric gene”, as used herein, refers to an EP17 promoter region operably linked to an open reading frame, wherein the nucleotide sequence created is not naturally occurring. In this regard, the open reading frame is also described as a “heterologous sequence”. The term “chimeric gene” also encompasses a promoter region operably linked to an EP17 coding sequence, a nucleotide sequence producing an antisense RNA molecule, an RNA molecule having tertiary structure, such as a hairpin structure, or a double-stranded RNA molecule.

[0086] The term “operably linked”, as used herein, refers to a promoter region that is connected to a nucleotide sequence in such a way that the transcription of that nucleotide sequence is controlled and regulated by that promoter region. Techniques for operatively linking a promoter region to a nucleotide sequence are well known in the art.

[0087] The terms “heterologous gene”, “heterologous DNA sequence”, “heterologous nucleotide sequence”, “exogenous nucleic acid molecule”, or “exogenous DNA segment”, as used herein, each refer to a sequence that originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified, for example by mutagenesis or by isolation from native cis-regulatory sequences. The terms also include non-naturally occurring multiple copies of a naturally occurring nucleotide sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid wherein the element is not ordinarily found.

[0088] The present invention further includes vectors comprising the disclosed EP17 sequences, including plasmids, cosmids, and viral vectors. The term “vector”, as used herein, refers to a DNA molecule having sequences that enable its replication in a compatible host cell. A vector also includes nucleotide sequences to permit ligation of nucleotide sequences within the vector, wherein such nucleotide sequences are also replicated in a compatible host cell. A vector can also mediate recombinant production of an EP17 polypeptide, as described further herein below. Preferred vectors include but are not limited to pBluescript (Stratagene), pUC18, pBLCAT3 (Luckow and Schutz (1987) Nucleic Acids Res 15:5490), pLNTK (Gorman et al. (1996) Immunity 5:241-252), and pBAD/gIII (Invitrogen). A preferred host cell is a mammalian cell; more preferably the cell is a Chinese hamster ovary cell, a HeLa cell, a baby hamster kidney cell, or a mouse cell; more preferably the cell is a mouse epididymal cell; even more preferably the cell is a human cell.

[0089] Nucleic acids of the present invention can be cloned, synthesized, recombinantly altered, mutagenized, or combinations thereof. Standard recombinant DNA and molecular cloning techniques used to isolate nucleic acids are well known in the art. Exemplary, non-limiting methods are described by Sambrook et al., eds. (1989) “Molecular Cloning”, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; by Silhavy et al. (1984) “Experiments with Gene Fusions”, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; by Ausubel et al. (1992) Current Protocols in Molecular Biology, John Wiley and Sons, Inc., New York; and by Glover, ed. (1985) “DNA Cloning: A Practical Approach”, IRL Press, Ltd., Oxford, U.K. Site-specific mutagenesis to create base pair changes, deletions, or small insertions is also well known in the art, as exemplified by publications; see, e.g., Adelman et al. (1983) DNA 2:183; Sambrook et al. (1989).

[0090] Sequences detected by methods of the invention can be detected, subcloned, sequenced, and further evaluated by any measure well known in the art using any method usually applied to the detection of a specific DNA sequence including but not limited to dideoxy sequencing, PCR, oligomer restriction (Saiki et al. (1985) Bio/Technology 3:1008-1012), allele-specific oligonucleotide (ASO) probe analysis (Conner et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:278), and oligonucleotide ligation assays (OLAs) (Landegren et al. (1988) Science 241:1007). Molecular techniques for DNA analysis have been reviewed (Landegren et al. (1988) Science 242:229-237).

I.B. EP17 Polypeptides

[0091] The polypeptides provided by the present invention include the isolated polypeptide of SEQ ID NO:4, polypeptides substantially similar to sequences of SEQ ID NO:4, EP17 polypeptide fragments, fusion proteins comprising EP17 amino acid sequences, biologically functional analogs, and polypeptides that cross-react with an antibody that specifically recognizes an EP17 polypeptide.

[0092] The term “isolated”, as used in the context of a polypeptide, indicates that the polypeptide exists apart from its native environment and is not a product of nature. An isolated polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.

[0093] The term “purified”, when applied to a polypeptide, denotes that the polypeptide is essentially free of other cellular components with which it is associated in the natural state. Preferably, a polypeptide is a homogeneous solid or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A polypeptide which is the predominant species present in a preparation is substantially purified. The term “purified” denotes that a polypeptide gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the polypeptide is at least about 50% pure, more preferably at least about 85% pure, and most preferably at least about 99% pure.

[0094] The term “substantially identical” in the context of two or more polypeptide sequences is measured by (a) polypeptide sequences having about 35%, or 45%, or preferably from 45-55%, or more preferably 55-66%, or most preferably 65% or greater amino acids which are identical or functionally equivalent. Percent “identity” and methods for determining identity are defined herein below under the heading Nucleotide and Amino Acid Sequence Comparisons.

[0095] Substantially identical polypeptides also encompass two or more polypeptides sharing a conserved three-dimensional structure. Computational methods can be used to compare structural representations, and structural superpositions can be generated and easily tuned to identify similarities around important active sites or ligand binding sites. See Henikoff et al. (2000) Electrophoresis 21(9):1700-1706; Huang et al. (2000) Pac Symp Biocomput 230-241; Saqi et al. (1999) Bioinformatics 15(6):521-522; and Barton (1998) Acta Crystallogr D Biol Crystallogr 54:1139-1146.

[0096] The term “functionally equivalent” in the context of amino acid sequences is well known in the art and is based on the relative similarity of the amino acid side-chain substituents. See Henikoff and Henikoff (2000) Adv Protein Chem 54:73-97. Relevant factors for consideration include side-chain hydrophobicity, hydrophilicity, charge, and size. For example, arginine, lysine, and histidine are all positively charged residues; alanine, glycine, and serine are all of similar size; and phenylalanine, tryptophan, and tyrosine all have a generally similar shape. By this analysis, described further herein below, arginine, lysine, and histidine; alanine, glycine, and serine; and phenylalanine, tryptophan, and tyrosine are defined herein as biologically functional equivalents.

[0097] In making biologically functional equivalent amino acid substitutions, the hydropathic index of amino acids can be considered. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics; these are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).

[0098] The importance of the hydropathic amino acid index in conferring interactive biological function on a protein is generally understood in the art (Kyte et al. (1982) J Mol Biol 157:105). It is known that certain amino acids can be substituted for other amino acids having a similar hydropathic index or score and still retain a similar biological activity. In making changes based upon the hydropathic index, the substitution of amino acids whose hydropathic indices are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
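The ±2, ±1, and ±0.5 windows can be applied mechanically to the hydropathic indices listed above. The following minimal sketch uses those index values with one-letter amino acid codes; the function name and tier labels are illustrative, and the difference is rounded to one decimal place to avoid floating-point artifacts at the window boundaries.

```python
# Hydropathic indices as listed in the text, keyed by one-letter amino acid code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_preference(original: str, substitute: str) -> str:
    """Rank a substitution by the difference in hydropathic index."""
    delta = round(abs(HYDROPATHY[original] - HYDROPATHY[substitute]), 1)
    if delta <= 0.5:
        return "even more particularly preferred (within 0.5)"
    if delta <= 1.0:
        return "particularly preferred (within 1)"
    if delta <= 2.0:
        return "preferred (within 2)"
    return "outside the preferred ranges"


if __name__ == "__main__":
    for original, substitute in [("I", "V"), ("K", "R"), ("I", "R")]:
        print(original, "->", substitute, ":", hydropathy_preference(original, substitute))
```

A substitution ranked as preferred by this index is only a candidate; retention of biological activity would still need to be confirmed experimentally.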

[0099] It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101 states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, i.e., with a biological property of the protein. It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent protein.

[0100] As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4).

[0101] In making changes based upon similar hydrophilicity values, the substitution of amino acids whose hydrophilicity values are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
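The "greatest local average hydrophilicity" mentioned above can be illustrated with a sliding-window average over the listed hydrophilicity values. This is a minimal sketch under stated assumptions: the six-residue window is an assumed parameter, the central values are used where the text gives a ±1 range, and the peptide is hypothetical rather than an EP17 sequence.

```python
# Hydrophilicity values as listed in the text (central values where a +/-1 range is given).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def greatest_local_average(sequence: str, window: int = 6) -> tuple[int, float]:
    """Return (0-based start index, average) of the most hydrophilic window.

    Assumption: a fixed window of `window` residues; ties keep the first window.
    """
    sequence = sequence.upper()
    if window < 1 or len(sequence) < window:
        raise ValueError("sequence must be at least as long as the window")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if avg > best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg


if __name__ == "__main__":
    # Hypothetical peptide with a charged stretch flanked by leucines.
    peptide = "LLLLLLKKDEKRLLLL"
    start, avg = greatest_local_average(peptide)
    print(start, peptide[start:start + 6], avg)
```

The window with the greatest average marks a candidate region for immunogenicity under the correlation described in U.S. Pat. No. 4,554,101; the sketch makes no prediction beyond locating that window.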

[0102] The present invention also encompasses EP17 polypeptide fragments or functional portions of an EP17 polypeptide. Such a functional portion need not comprise all or substantially all of the amino acid sequence of a native lipocalin gene product. The term “functional” includes any biological activity or feature of EP17, including immunogenicity.

[0103] The present invention also includes longer sequences of an EP17 polypeptide, or portion thereof. For example, one or more amino acids can be added to the N-terminus or C-terminus of an EP17 polypeptide. Fusion proteins comprising EP17 polypeptide sequences are also provided within the scope of the present invention. Methods of preparing such proteins are known in the art.

[0104] The present invention also encompasses functional analogs of an EP17 polypeptide. Functional analogs share at least one biological function with an EP17 polypeptide. An exemplary function is immunogenicity. In the context of amino acid sequence, biologically functional analogs, as used herein, are peptides in which certain, but not most or all, of the amino acids can be substituted. Functional analogs can be created at the level of the corresponding nucleic acid molecule, altering such sequence to encode desired amino acid changes. In one embodiment, changes can be introduced to improve the antigenicity of the protein. In another embodiment, an EP17 polypeptide sequence is varied so as to assess the activity of a mutant EP17 polypeptide.

[0105] The present invention also encompasses recombinant production of the disclosed EP17 polypeptides. Briefly, a nucleic acid sequence encoding an EP17 polypeptide, or portion thereof, is cloned into an expression cassette, and the cassette is introduced into a host organism, where the polypeptide is recombinantly produced.

[0106] The term “expression cassette” as used herein means a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest which is operably linked to termination signals. It also typically comprises sequences required for proper translation of the nucleotide sequence. The expression cassette comprising the nucleotide sequence of interest can be chimeric. The expression cassette can also be one which is naturally occurring but has been obtained in a recombinant form useful for heterologous expression.

[0107] The expression of the nucleotide sequence in the expression cassette can be under the control of a constitutive promoter or an inducible promoter which initiates transcription only when the host cell is exposed to some particular external stimulus. Exemplary promoters include Simian virus 40 early promoter, a long terminal repeat promoter from retrovirus, an actin promoter, a heat shock promoter, and a metallothionein promoter. In the case of a multicellular organism, the promoter and promoter region can direct expression to a particular tissue or organ or stage of development. Exemplary tissue-specific promoter regions include a mE-RABP promoter and an EP17 promoter, described herein above. Suitable expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus, yeast vectors, bacteriophage vectors (e.g., lambda phage), and plasmid and cosmid DNA vectors.

[0108] The term “host cell”, as used herein, refers to a cell into which a heterologous nucleic acid molecule has been introduced. Transformed cells, tissues, or organisms are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

[0109] A host cell strain can be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. For example, different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, phosphorylation) of proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. Expression in a bacterial system can be used to produce a non-glycosylated core protein product. Expression in yeast will produce a glycosylated product. Expression in animal cells can be used to ensure “native” glycosylation of a heterologous protein.

[0110] Expression constructs are transfected into a host cell by any standard method, including electroporation, calcium phosphate precipitation, DEAE-Dextran transfection, liposome-mediated transfection, and infection using a retrovirus. The EP17-encoding nucleotide sequence carried in the expression construct can be stably integrated into the genome of the host or it can be present as an extrachromosomal molecule.

[0111] Isolated polypeptides and recombinantly produced polypeptides can be purified and characterized using a variety of standard techniques that are well known to the skilled artisan. See, e.g., chapter 16 of Ausubel et al. (1992); Bodanszky et al. (1976) “Peptide Synthesis”, Second Edition, John Wiley and Sons, New York; and Zimmer et al. (1993) “Peptides” pp. 393-394, ESCOM Science Publishers, B. V.

I.C. Nucleotide and Amino Acid Sequence Comparisons

[0112] The terms “identical” or percent “identity” in the context of two or more nucleotide or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms disclosed herein or by visual inspection.

[0113] The term “substantially identical” in regards to a nucleotide or polypeptide sequence means that a particular sequence varies from the sequence of a naturally occurring sequence by one or more deletions, substitutions, or additions, the net effect of which is to retain at least some of the biological activity of the natural gene, gene product, or sequence. Such sequences include “mutant” sequences, or sequences wherein the biological activity is altered to some degree but retains at least some of the original biological activity. The term “naturally occurring”, as used herein, is used to describe a composition that can be found in nature as distinct from being artificially produced by man. For example, a protein or nucleotide sequence present in an organism, which can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory, is naturally occurring.

[0114] For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer program, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are selected. The sequence comparison algorithm then calculates the percent sequence identity for the designated test sequence(s) relative to the reference sequence, based on the selected program parameters.
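By way of illustration only, the percent identity calculation for an already aligned reference and test sequence can be sketched as follows. The function name, the use of "-" as the gap character, and the choice of alignment length (gaps included) as the denominator are assumptions made for this sketch; the comparison programs named herein may use other conventions.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumptions (not taken from the specification): '-' marks a gap, and
    the denominator is the number of alignment columns, gaps included.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_ref, aligned_test)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only when both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

For example, `percent_identity("ACGT-A", "ACCTGA")` compares six columns, four of which are identical.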

[0115] Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv Appl Math 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J Mol Biol 48:443, by the search for similarity method of Pearson and Lipman (1988) Proc Natl Acad Sci USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, Wis.), or by visual inspection (See generally, Ausubel et al. (1992)).
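A minimal sketch of a local alignment score in the manner of Smith and Waterman is given below for illustration. The scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for the example and are not parameters of any program named above; only the best score is returned, without traceback.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (score only).

    Uses a linear gap penalty and keeps two rows of the dynamic
    programming matrix. The scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

The floor at zero is what distinguishes this local method from a global alignment in the manner of Needleman and Wunsch, which scores the two sequences end to end.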

[0116] A preferred algorithm for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J Mol Biol 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff and Henikoff (1989) Proc Natl Acad Sci USA 89:10915.
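The three stopping conditions for extending a word hit can be illustrated with the following ungapped, one-directional sketch. It is not the BLAST implementation; the function name and the drop-off value X=20 are assumptions made for the example, while the reward M=5 and penalty N=−4 are the nucleotide values recited above.

```python
def extend_right(query: str, subject: str, q_pos: int, s_pos: int,
                 reward: int = 5, penalty: int = -4, x_drop: int = 20):
    """Extend an ungapped hit to the right; return (best_score, best_length).

    Stops when the score falls x_drop below its maximum, when the score
    reaches zero or below, or when either sequence ends. The x_drop
    default is an illustrative assumption.
    """
    score = best = 0
    best_len = length = 0
    while q_pos + length < len(query) and s_pos + length < len(subject):
        same = query[q_pos + length] == subject[s_pos + length]
        score += reward if same else penalty
        length += 1
        if score > best:
            best, best_len = score, length
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len
```

A full extension would apply the same procedure leftward from the seed and combine the two best scores.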

[0117] In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences. See, e.g., Karlin and Altschul (1993) Proc Natl Acad Sci USA 90:5873-5887. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
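The three thresholds recited above can be expressed as a simple classification, sketched here for illustration. The function name and tier labels are hypothetical, and the text's "about" is treated as an exact cutoff for the purpose of the example.

```python
def similarity_tier(p_n: float) -> str:
    """Map a smallest sum probability P(N) to the tiers recited in the text.

    Cutoffs of 0.1, 0.01, and 0.001 come from the specification; treating
    them as exact boundaries is an assumption of this sketch.
    """
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) must lie between 0 and 1")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.1:
        return "similar"
    return "not similar"
```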

I.D. Antibodies

[0118] The present invention also provides an antibody immunoreactive with an EP17 polypeptide. The term “antibody” indicates an immunoglobulin protein, or functional portion thereof, including a polyclonal antibody, a monoclonal antibody, a chimeric antibody, a single chain antibody, Fab fragments, and an Fab expression library. “Functional portion” refers to the part of the protein that binds a molecule of interest. In a preferred embodiment, an antibody of the invention is a monoclonal antibody. Techniques for preparing and characterizing antibodies are well known in the art (See, e.g., Harlow and Lane (1988) “Antibodies: A Laboratory Manual” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). A monoclonal antibody of the present invention can be readily prepared through use of well-known techniques such as the hybridoma techniques exemplified in U.S. Pat. No. 4,196,265 and the phage display techniques disclosed in U.S. Pat. No. 5,260,203.

[0119] The phrase “specifically (or selectively) binds to an antibody”, or “specifically (or selectively) immunoreactive with”, when referring to a protein or peptide, refers to a binding reaction which is determinative of the presence of the protein in a heterogeneous population of proteins and other biological materials. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not show significant binding to other proteins present in the sample. Specific binding to an antibody under such conditions can require an antibody that is selected for its specificity for a particular protein. For example, antibodies raised to a protein with an amino acid sequence encoded by any of the nucleic acid sequences of the invention can be selected to obtain antibodies specifically immunoreactive with that protein and not with unrelated proteins.

[0120] The use of a molecular cloning approach to generate antibodies, particularly monoclonal antibodies, and more particularly single chain monoclonal antibodies, is also provided. The production of single chain antibodies has been described in the art. See, e.g., U.S. Pat. No. 5,260,203. For this approach, combinatorial immunoglobulin phagemid libraries are prepared from RNA isolated from the spleen of the immunized animal, and phagemids expressing appropriate antibodies are selected by panning on endothelial tissue. The advantages of this approach over conventional hybridoma techniques are that approximately 10⁴ times as many antibodies can be produced and screened in a single round, and that new specificities are generated by heavy (H) and light (L) chain combinations in a single chain, which further increases the chance of finding appropriate antibodies. Thus, an antibody of the present invention, or a “derivative” of an antibody of the present invention, pertains to a single polypeptide chain binding molecule which has binding specificity and affinity substantially similar to the binding specificity and affinity of the light and heavy chain aggregate variable region of an antibody described herein.

[0121] The term “immunochemical reaction”, as used herein, refers to any of a variety of immunoassay formats used to detect antibodies specifically bound to a particular protein, including but not limited to, competitive and non-competitive assay systems using techniques such as radioimmunoassays, ELISA (enzyme linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. See Harlow and Lane (1988) for a description of immunoassay formats and conditions.

I.E. Protein Binding Assays

[0122] The term “binding” refers to an affinity between two molecules, for example, a ligand and a receptor, and means a preferential binding of one molecule for another in a mixture of molecules. The binding of the molecules can be considered specific if the binding affinity is about 1×10⁴ M⁻¹ to about 1×10⁶ M⁻¹ or greater. Binding of two molecules also encompasses a quality or state of mutual action such that an activity of one protein or compound on another protein is inhibitory (in the case of an antagonist) or enhancing (in the case of an agonist).
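For orientation, an affinity expressed as an association constant in M⁻¹ corresponds to a dissociation constant that is its reciprocal, so that 1×10⁴ M⁻¹ corresponds to 100 µM and 1×10⁶ M⁻¹ to 1 µM. The sketch below illustrates the conversion; the function names are hypothetical, and using 1×10⁴ M⁻¹ as a hard lower bound is an assumption, since the text recites "about".

```python
def dissociation_constant(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (M)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def is_specific(ka_per_molar: float, threshold: float = 1e4) -> bool:
    """True when the affinity meets the lower bound recited in the text.

    Treating 1e4 M^-1 as an exact cutoff is an assumption of this sketch.
    """
    return ka_per_molar >= threshold
```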

[0123] Fluorescence Correlation Spectroscopy (FCS) theory was developed in 1972 but it is only in recent years that the technology to perform FCS became available (Magde et al. (1972) Phys Rev Lett 29:705-708; Maiti et al. (1997) Proc Natl Acad Sci USA 94:11753-11757). FCS measures the average diffusion rate of a fluorescent molecule within a small sample volume. The sample size can be as low as 10³ fluorescent molecules and the sample volume as low as the cytoplasm of a single bacterium. The diffusion rate is a function of the mass of the molecule and decreases as the mass increases. FCS can therefore be applied to protein-ligand interaction analysis by measuring the change in mass and therefore in diffusion rate of a molecule upon binding. In a typical experiment, the target to be analyzed is expressed as a recombinant protein with a sequence tag, such as a poly-histidine sequence, inserted at the N-terminus or C-terminus. The expression takes place in E. coli, yeast or mammalian cells. The protein is purified by chromatography. For example, the poly-histidine tag can be used to bind the expressed protein to a metal chelate column such as Ni²⁺ chelated on iminodiacetic acid agarose. The protein is then labeled with a fluorescent tag such as carboxytetramethylrhodamine or BODIPY™ (Molecular Probes, Eugene, Oreg.). The protein is then exposed in solution to the potential ligand, and its diffusion rate is determined by FCS using instrumentation available from Carl Zeiss, Inc. (Thornwood, N.Y.). Ligand binding is determined by changes in the diffusion rate of the protein.
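The dependence of diffusion rate on mass can be approximated, for illustration only, by treating the molecule as a compact sphere of constant density, in which case the Stokes-Einstein relation gives a diffusion coefficient proportional to the inverse cube root of the mass. This model, the function name, and the masses in the usage note are assumptions of the sketch and are not taken from the specification.

```python
def diffusion_ratio(mass_free_da: float, mass_bound_da: float) -> float:
    """Ratio D_bound / D_free under a compact-sphere approximation.

    Assumes D is proportional to mass ** (-1/3) (Stokes-Einstein with
    constant density). Real molecules deviate from this idealization.
    """
    if mass_free_da <= 0 or mass_bound_da <= 0:
        raise ValueError("masses must be positive")
    return (mass_free_da / mass_bound_da) ** (1.0 / 3.0)
```

Under this approximation, doubling the mass of a hypothetical 17 kDa protein lowers its diffusion coefficient by roughly one fifth, whereas binding a 300 Da ligand changes it by well under one percent.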

[0124] Surface-Enhanced Laser Desorption/Ionization (SELDI) was invented by Hutchens and Yip ((1993) Rapid Commun Mass Spectrom 7:576-580). When coupled to a time-of-flight mass spectrometer (TOF), SELDI provides means to rapidly analyze molecules retained on a chip. It can be applied to ligand-protein interaction analysis by covalently binding the target protein on the chip and analyzing by MS the small molecules that bind to this protein (Worrall et al. (1998) Anal. Biochem. 70:750-756). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the SELDI chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via, for example, a delivery system able to pipet the ligands in a sequential manner (autosampler). The chip is then submitted to washes of increasing stringency, for example a series of washes with buffer solutions of increasing ionic strength. After each wash, the bound material is analyzed by submitting the chip to SELDI-TOF. Ligands that specifically bind the target are identified by the stringency of the wash needed to elute them.

[0125] Biacore relies on changes in the refractive index at the surface layer upon binding of a ligand to a protein immobilized on the layer. In this system, a collection of small ligands is injected sequentially in a 2-5 microliter cell, wherein the protein is immobilized within the cell. Binding is detected by surface plasmon resonance (SPR) by recording laser light refracting from the surface. In general, the refractive index change for a given change of mass concentration at the surface layer is practically the same for all proteins and peptides, allowing a single method to be applicable for any protein (Liedberg et al. (1983) Sensors Actuators 4:299-304; Malmquist (1993) Nature 361:186-187). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the Biacore chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via the delivery system incorporated in the instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a sequential manner (autosampler). The SPR signal on the chip is recorded and changes in the refractive index indicate an interaction between the immobilized target and the ligand. Analysis of the on-rate and off-rate kinetics of the signal allows discrimination between non-specific and specific interaction.

I.F. Transgenic Animals

[0126] It is also within the scope of the present invention to prepare a transgenic animal to mutagenize the EP17 locus or to express a transgene comprising nucleic acid sequences of the present invention. The term “transgenic animal” indicates an animal comprising a germline insertion of a heterologous nucleic acid. Transgenic animals of the present invention are understood to encompass not only the end product of a transformation method, but also transgenic progeny thereof.

[0127] The term “transgene”, as used herein, indicates a heterologous nucleic acid molecule that has been transformed into a host cell. For intended use in the creation of a transgenic animal, the transgene includes genomic sequences of the host organism at a selected locus or site of transgene integration to mediate a homologous recombination event. A transgene further comprises nucleic acid sequences of interest, for example a targeted modification of the gene residing within the locus, a reporter gene, or an expression cassette, each defined herein above.

[0128] Transgene integration can be used to create gene mutations, including “knock-out”, “knock-in”, or “knock-down” mutations. The term “knock-out” refers to a homologous recombination event that renders a gene inactive. Gene knock-out is generally accomplished by integration of the transgene at a chromosomal locus, thereby interrupting a gene residing at that locus. The term “knock-in” refers to in vivo replacement at a targeted locus. Knock-in mutations can modify a gene sequence to create a loss-of-function or gain-of-function mutation. The term “gene knock-down” refers to a homologous recombination event wherein the transgene partially eliminates gene function. A knock-down animal can be created by transgenic expression of an antisense molecule, wherein a transgene comprising the antisense sequence and a relevant promoter is integrated into the genome at a non-essential locus. Expression of the antisense or ribozyme molecule disrupts the corresponding gene function, although this disruption is generally incomplete (Luyckx et al. (1999) Proc Natl Acad Sci U S A 96(21):12174-12179).

[0129] Conditional mutation can be accomplished using transgenic methods in combination with the Cre-recombinase system in mice. Briefly, in one instance, a transgenic mouse is derived that expresses Cre-recombinase under the direction of an inducible promoter. A second transgenic mouse bears a mutation of a gene of interest as well as a lox-P-flanked endogenous gene sequence. Such transgenic mice are mated, the resulting progeny having both the Cre-recombinase and lox-P-flanked transgenes. Induction of Cre recombinase catalyzes excision of the lox-P-flanked transgene, thereby excising a portion of the endogenous gene sequence and revealing the mutated sequence. Conditional knockout can be varied according to the temporal and spatial features of Cre recombinase expression, inherent in the selection of a promoter to drive Cre recombinase. See Postic et al. (1999) J Biol Chem 275(1):305-315; and Sauer (1998) Methods 14(4):381-392.

[0130] Transgenes can also be used for heterologous expression in a host organism without generating phenotypically apparent mutations. By this method, nucleotide sequences of interest are introduced into the genome at a nonessential locus, whereby insertion alone does not disrupt an essential gene function.

[0131] Techniques for the preparation of transgenic animals are known in the art. Exemplary techniques are described in U.S. Pat. No. 5,489,742 (transgenic rats); U.S. Pat. Nos. 4,736,866, 5,550,316, 5,614,396, 5,625,125 and 5,648,061 (transgenic mice); U.S. Pat. No. 5,573,933 (transgenic pigs); U.S. Pat. No. 5,162,215 (transgenic avian species); and U.S. Pat. No. 5,741,957 (transgenic bovine species). Briefly, nucleotide sequences of interest are cloned into a vector (e.g., pLNK; Gorman et al. 1996), and the construct is transformed into a germ cell. In the germ cell, a chromosomal rearrangement event takes place wherein the nucleic acid sequences of interest are integrated into the genome of the germ cell by homologous recombination. Fertilization and propagation of the transformed germ cell results in a transgenic animal. Homozygosity of the mutation is accomplished by intercrossing.

I.G. Therapeutic Methods

[0132] The present invention further provides methods for discovering substances that can be used as pharmaceutical compositions. The terms “pharmaceutical composition” and “drug”, as used herein, each refer to any substance having a biological activity. Substances discovered by methods of the present invention include but are not limited to polypeptides, proteins, peptides, chemical compounds, and antibodies.

[0133] A composition of the present invention is typically formulated using acceptable vehicles, adjuvants, and carriers as desired.

[0134] Among the acceptable vehicles and solvents that can be employed are water, Ringer's solution, and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose any bland fixed oil can be employed including synthetic mono- or di-glycerides. In addition, fatty acids such as oleic acid find use in the preparation of injectable compositions.

[0135] Injectable preparations, for example sterile injectable aqueous or oleaginous suspensions, are formulated according to the known art using suitable dispersing or wetting agents and suspending agents. The sterile injectable preparation can also be a sterile injectable solution or suspension in a nontoxic diluent or solvent, for example, as a solution in 1,3-butanediol.

[0136] A vector, for example an adenovirus vector, can be used as a carrier for gene therapy methods. The vector is purified sufficiently to render it essentially free of undesirable contaminants, such as defective interfering adenovirus particles or endotoxins and other pyrogens, such that it does not cause any untoward reactions in the individual receiving the vector construct. A preferred means of purifying the vector involves the use of buoyant density gradients, such as cesium chloride gradient centrifugation.

[0137] A transfected cell can also serve as a carrier. By way of example, a liver cell can be removed from an organism, transfected with a nucleic acid sequence of the present invention using methods set forth above, and then the transfected cell returned to the organism (e.g., injected intravascularly).

[0138] Monoclonal antibodies or polypeptides of the invention can be administered parenterally by injection or by gradual infusion over time. Although the tissue to be treated can typically be accessed in the body by systemic administration and is therefore most often treated by intravenous administration of therapeutic compositions, other tissues and delivery means are provided where there is a likelihood that the tissue targeted contains the target molecule, and such means are known to those of skill in the art.

[0139] Representative antibodies for use in the present invention are intact immunoglobulin molecules, substantially intact immunoglobulin molecules, single chain immunoglobulins or antibodies, and those portions of an immunoglobulin molecule that contain the paratope, including antibody fragments. It is contemplated to be within the scope of the present invention that a monovalent modulator can optionally be used.

[0140] Methods of preparing “humanized” antibodies are generally well known in the art, and can readily be applied to the antibodies of the present invention. Humanized monoclonal antibodies offer particular advantages over monoclonal antibodies derived from other mammals, particularly insofar as they can be used therapeutically in humans. Specifically, humanized antibodies are not cleared from the circulation as rapidly as “foreign” antigens, and do not activate the immune system in the same manner as foreign antigens and foreign antibodies.

[0141] With respect to the therapeutic methods of the present invention, a preferred subject is a vertebrate subject. A preferred vertebrate is warm-blooded; a preferred warm-blooded vertebrate is a mammal. A preferred mammal is a mouse or, most preferably, a human. As used herein and in the claims, the term “patient” includes both human and animal patients. Thus, veterinary therapeutic uses are provided in accordance with the present invention.

[0142] Also provided is the treatment of mammals such as humans, as well as those mammals of importance due to being endangered, such as Siberian tigers; of economic importance, such as animals raised on farms for consumption by humans; and/or animals of social importance to humans, such as animals kept as pets or in zoos. Examples of such animals include but are not limited to: carnivores such as cats and dogs; swine, including pigs, hogs, and wild boars; ruminants and/or ungulates such as cattle, oxen, sheep, giraffes, deer, goats, bison, and camels; and horses. Also provided is the treatment of birds, including the treatment of those kinds of birds that are endangered and/or kept in zoos, as well as fowl, and more particularly domesticated fowl, i.e., poultry, such as turkeys, chickens, ducks, geese, guinea fowl, and the like, as they are also of economic importance to humans. Thus, provided is the treatment of livestock, including, but not limited to, domesticated swine, ruminants, ungulates, horses, poultry, and the like.

[0143] As used herein, the term “experimental subject” refers to any subject or sample in which the desired measurement is unknown. The term “control subject” refers to any subject or sample in which a desired measure is known.

[0144] As used herein, an “effective” dose refers to a dose(s) administered to an individual patient sufficient to cause a change in EP17 activity. After review of the disclosure herein of the present invention, one of ordinary skill in the art can tailor the dosages to an individual patient, taking into account the particular formulation and method of administration to be used with the composition as well as patient height, weight, severity of symptoms, and stage of the biological condition to be treated. Such adjustments or variations, as well as evaluation of when and how to make such adjustments or variations, are well known to those of ordinary skill in the art of medicine.

[0145] A therapeutically effective amount can comprise a range of amounts. One skilled in the art can readily assess the potency and efficacy of an EP17 modulator of this invention and adjust the therapeutic regimen accordingly. A modulator of EP17 biological activity can be evaluated by a variety of means including the use of a responsive reporter gene, interaction of lipocalin polypeptides with a monoclonal antibody, and fertility assays, each technique described herein.

[0146] Additional formulation and dose techniques have been described in the art, see for example, those described in U.S. Pat. Nos. 5,326,902 and 5,234,933, and PCT Publication WO 93/25521.

[0147] For the purposes described above, the identified substances can normally be administered systemically, parenterally, or orally. The term “parenteral” as used herein includes intravenous, intra-muscular, intra-arterial injection, or infusion techniques. Other compositions for administration include liquids for external use, and endermic liniments (ointment, etc.), suppositories, and pessaries which comprise one or more of the active substance(s) and can be prepared by known methods.

II. Cloning and Expression of the Mouse EP17 Gene

[0148] The present co-inventors have identified a new lipocalin, mEP17, that is adjacent to a related lipocalin-encoding gene, mE-RABP, on mouse chromosome 2 (FIG. 1). The genomic organization of the mEP17 gene was determined by prediction of exons within a BAC genomic clone and further supported by cloning of the mEP17 cDNA (Lareyre et al., 2001). FIG. 1 depicts the genomic organization of the mEP17 locus. Exon sizes are indicated in nucleotides. The major transcription initiation sites of both genes are represented with broken arrows. Primer FwMEP17cDNA (SEQ ID NO:7) was used for primer extension analysis, as described herein below. The G-X-W and T-D-Y motifs and two cysteine residues (C) are also indicated.

[0149] To isolate the mEP17 promoter region, two clones containing 6.3 kb EcoRV restriction fragments within the 5′ flanking region were isolated from the genomic BAC clone 10983 (FIG. 9). DNA sequencing analysis of both clones revealed that the 6.3 kb fragment contains 5.4 kb of the 5′ flanking region of the mEP17 gene. Cloning methods are provided in Example 1.

[0150] The tissue distribution of mRNA encoding the mEP17 protein was examined by Northern blot analysis of total RNA from twelve different tissues, including spleen, liver, heart, lung, brain, kidney, testis, epididymis, vas deferens, seminal vesicles, uterus, and ovary (FIG. 2A). Hybridization of Northern blots with a [³²P]-radiolabeled mEP17 cDNA probe revealed two RNA species of about 3.1 kb and 1 kb only in the epididymis. The total length of the mEP17 gene, including exons and introns, is 3.1 kb. To determine whether the 3.1 kb RNA could be the precursor RNA, two epididymal RNA samples were run side by side and hybridized individually with the cDNA probe or with a probe encompassing intron 1 of the mEP17 gene (FIG. 2B). The intron 1 probe hybridized with the 3.1 kb RNA but not the 1 kb RNA, indicating that the 3.1 kb RNA is an unspliced precursor RNA.

[0151] To investigate the tissue-, region-, and cell-specific profile of gene expression, in situ hybridization of mEP17 transcripts was carried out using sense and antisense digoxigenin-labeled riboprobes generated from mEP17 cDNA (FIGS. 3 and 4A), as described in Example 2 below. mEP17 mRNA was detected only in the principal cells of the initial segment of the epididymis and was localized basally. Hybridization was not detected when the sense riboprobe was used (FIG. 4B). mEP17 gene expression was high throughout the initial segment (IS). A checkerboard pattern was observed at the boundary between the initial segment and the proximal caput epididymis, wherein some cells expressed mEP17 and some cells did not. mEP17 mRNA was not detected in the efferent ducts (ED), mid and distal caput (Cp), corpus and cauda epididymis using sense or antisense probes.

III. Isolation of Human EP17

[0152] The present invention also provides a human EP17 gene. Preferably, the human EP17 gene comprises the sequence set forth as SEQ ID NO:2, a nucleic acid molecule that is substantially similar to SEQ ID NO:2, or a nucleic acid molecule comprising a 20 base pair nucleotide sequence that is identical to a contiguous 20 base pair sequence of SEQ ID NO:2.
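The last criterion, a 20 base pair sequence identical to a contiguous 20 base pair stretch of a reference, can be checked mechanically as sketched below. The function name is hypothetical, only the given strand is examined (reverse-complement matching is not attempted), and the usage note relies on synthetic strings because SEQ ID NO:2 is not reproduced in this section.

```python
def shares_contiguous_match(candidate: str, reference: str, k: int = 20) -> bool:
    """True if candidate and reference share an identical stretch of k bases.

    Compares the given strands only; checking the reverse complement is
    left out of this sketch.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < k or len(reference) < k:
        return False
    # Index every k-mer of the reference, then scan the candidate.
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(candidate[i:i + k] in ref_kmers
               for i in range(len(candidate) - k + 1))
```

With a synthetic reference such as `"ACGT" * 10`, a candidate that embeds bases 5 to 25 of the reference returns `True`, while a run of thirty adenines returns `False`.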

[0153] The mouse EP17 sequence was used to query databases of human genomic sequence, including GenBank and proprietary databases of Celera Genomics Corp. (Rockville, Md.). Two DNA fragments derived from human chromosome 9 were identified (Accession numbers AL35598.7 and 449425.3) in GenBank, although neither sequence nor the combination of the sequences predicts the hEP17 gene. A genomic region having sequences corresponding to the hEP17 gene was also identified in Celera's database (Accession number GA 65 373998). The genomic sequence derived from Celera's database was unannotated and did not predict the hEP17 gene.

[0154] The hEP17 gene disclosed herein (SEQ ID NO:2) was predicted by comparing unannotated genomic sequence and the gene structure of mouse EP17. Conserved intron/exon boundaries and conserved nucleotide sequence were recognized and used to construct the gene map depicted in FIG. 5 and Table 1. Preferably, the human EP17 gene comprises a coding region and a promoter region set forth as SEQ ID NOs:3 and 5, respectively.

TABLE 1

Feature      from (base pair)   to (base pair)
Exon 1             5160              5255
Intron 1           5256             12495
Exon 2            12496             12626
Intron 2          12627             13072
Exon 3            13073             13143
Intron 3          13144             14148
Exon 4            14149             14253
Intron 4          14254             14331
Exon 5            14332             14421
Intron 5          14422             14504
Exon 6            14505             14530
Intron 6          14531             15155
Exon 7            15156             15279
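The coordinates of Table 1 can be checked for internal consistency with a short Python sketch. The coordinates below are copied from Table 1; the variable and function names are hypothetical and used for illustration only. The sketch confirms that each feature begins one base after the preceding feature ends, and sums the exon and intron lengths. The summed exon length is a transcript length that may include untranslated sequence, so it should not be read as the length of the coding region.

```python
# Feature coordinates copied from Table 1 (inclusive base-pair positions).
TABLE_1 = [
    ("Exon 1", 5160, 5255),
    ("Intron 1", 5256, 12495),
    ("Exon 2", 12496, 12626),
    ("Intron 2", 12627, 13072),
    ("Exon 3", 13073, 13143),
    ("Intron 3", 13144, 14148),
    ("Exon 4", 14149, 14253),
    ("Intron 4", 14254, 14331),
    ("Exon 5", 14332, 14421),
    ("Intron 5", 14422, 14504),
    ("Exon 6", 14505, 14530),
    ("Intron 6", 14531, 15155),
    ("Exon 7", 15156, 15279),
]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given inclusive start and end coordinates."""
    return end - start + 1


def is_contiguous(features) -> bool:
    """True if every feature starts exactly one base after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(features, features[1:]))


exon_total = sum(feature_length(s, e) for name, s, e in TABLE_1 if name.startswith("Exon"))
intron_total = sum(feature_length(s, e) for name, s, e in TABLE_1 if name.startswith("Intron"))
gene_span = feature_length(TABLE_1[0][1], TABLE_1[-1][2])

if __name__ == "__main__":
    print(is_contiguous(TABLE_1), exon_total, intron_total, gene_span)
    # prints: True 643 9477 10120
```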

[0155] The predicted hEP17 gene displays sequence homology with other lipocalins, most notably with mE-RABP and prostaglandin H₂-D isomerases (FIG. 6). The mouse and human EP17 proteins share 61% overall identity and have conserved lipocalin motifs (G-X-W, T-D-Y, and two cysteine residues) at similar positions (FIG. 7).
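The positions of the lipocalin motifs named above can be located in a protein sequence with a simple pattern search, sketched below in Python. This is an illustrative sketch only: the function name is hypothetical, the toy sequence is invented and is not SEQ ID NO:4 or 6, and a pattern match alone does not establish that a protein is a lipocalin.

```python
import re

# Motifs named in the text: G-X-W (X is any residue) and T-D-Y.
MOTIF_PATTERNS = {
    "G-X-W": re.compile(r"G.W"),
    "T-D-Y": re.compile(r"TDY"),
}


def find_lipocalin_motifs(protein: str) -> dict:
    """Return 1-based start positions of each motif and of every cysteine residue."""
    protein = protein.upper()
    hits = {
        name: [match.start() + 1 for match in pattern.finditer(protein)]
        for name, pattern in MOTIF_PATTERNS.items()
    }
    hits["Cys"] = [i + 1 for i, residue in enumerate(protein) if residue == "C"]
    return hits


# Invented toy sequence for illustration only.
toy_protein = "MKLLAGQWYAIACSTDYKNLCE"

if __name__ == "__main__":
    print(find_lipocalin_motifs(toy_protein))
```

Running the same function on two aligned orthologs allows the motif positions to be compared, in the manner of the comparison depicted in FIG. 7.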

[0156] The amino terminal regions of both mouse and human EP17 proteins are predicted to be a signal peptide since in each case the region is highly hydrophobic, similar to the signal peptide of the mE-RABP protein, and in agreement with the sliding window/matrix scoring method and -1,-3 rule for predicting a peptide cleavage site (von Heijne (1986) Nucleic Acids Res 14:4683-4690) (FIG. 8). This observation implies that the human and mouse EP17 genes encode secreted proteins.
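The hydrophobic character of an amino terminal region can be illustrated by a sliding-window average of residue hydropathy, sketched below in Python. The following are assumptions not found in the present disclosure: the Kyte-Doolittle hydropathy scale, the window size, and the function name. The sketch shows only the hydrophobicity profile; it is not the weight-matrix scoring method of von Heijne and does not predict a cleavage site.

```python
# Kyte-Doolittle hydropathy values (an assumption; the text names no particular scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of each window of `window` residues, from the N terminus onward.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    protein = protein.upper()
    if window < 1 or len(protein) < window:
        return []
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Invented toy sequence: a leucine-rich start followed by charged residues.
    profile = hydropathy_profile("MKLLLLALLLAVSAQDEEKRDNSTE", window=9)
    print([round(value, 2) for value in profile])
```

A candidate signal peptide would be expected to show high positive window averages near the amino terminus, falling off in the mature protein.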

IV. Characterization of an EP17 Promoter Region

[0157] The transcription initiation sites of the mEP17 gene were determined by primer extension using epididymal total RNA as a template and the EP17PE2 primer (SEQ ID NO:7) designed according to sequence in the first exon (FIG. 10). Primer extension methods are described in Example 4 below. Two major transcription initiation sites were localized 22 and 18 nucleotides from the putative translation initiation site, and were numbered +1 and +5, respectively. Two minor transcription initiation sites were also detected at positions +2 and +4. As shown in FIG. 10, total RNA extracted from the epididymis (Ep) or transfer (t) RNA was reverse transcribed with [³²P]-radiolabeled EP17PE2 primer (SEQ ID NO:7) and extended using Avian Myeloblastosis Virus (AMV) reverse transcriptase. Lanes labeled “C”, “T”, “A” and “G” are [³⁵S]-radiolabeled DNA sequencing reactions carried out using the EP17PE2 primer (SEQ ID NO:7) and the pHindIII clone (shown in FIG. 9) as template. The locations of two major (arrows) and two minor (arrowheads) transcription initiation sites are indicated.
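The numbering of the initiation sites follows from simple arithmetic, sketched below in Python. The sketch assumes that the distance from the translation initiation site is counted in the same way for every site; the function names are hypothetical. Under that assumption, the minor sites at +2 and +4 lie 21 and 19 nucleotides from the translation initiation site.

```python
# Distance, in nucleotides, of the most upstream major site (+1) from the
# putative translation initiation site, as stated in the text.
DISTANCE_OF_PLUS_ONE = 22


def site_number(distance_from_translation_start: int) -> int:
    """Convert a distance upstream of the translation initiation site to +N numbering."""
    return DISTANCE_OF_PLUS_ONE - distance_from_translation_start + 1


def distance_of_site(number: int) -> int:
    """Convert +N numbering back to a distance from the translation initiation site."""
    return DISTANCE_OF_PLUS_ONE - number + 1


if __name__ == "__main__":
    print(site_number(22), site_number(18))            # major sites
    print(distance_of_site(2), distance_of_site(4))    # minor sites
```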

[0158] The 5′ flanking sequence closest to the transcription start site was analyzed further. A 2.5-kb EcoRI restriction fragment comprising this sequence was isolated from the genomic BAC clone 10983 (shown in FIG. 9). A computer analysis to identify putative cis-regulatory sites was carried out using TFSEARCH version 1.3 (Yutaka Akiyama: “TFSEARCH: Searching Transcription Factor Binding Sites”, http://www.rwcp.or.jp/papia/). This analysis revealed the presence of several binding sites for known transcription factors, including binding sites for androgen receptor (ARSB), retinoic acid receptor (RARE), Stimulating Protein 1 (SP-1), Activator Protein 1 (AP-1), Activator Protein 4 (AP4), SRY (Sex-determining Region Y protein), C-Ets (cellular ets oncogene), C/EBP (CCAAT/enhancer binding protein), and Sox-5 (SRY-related sequence #5 protein). Putative cis-regulatory sites are underlined in FIG. 11. A consensus TATA box and CAAT-box are indicated. Major transcription initiation sites are marked by long arrows, and minor transcription initiation sites are marked by arrowheads.

[0159] To define functional sequences within the promoter region, several chimeric reporter genes were constructed by ligation of various portions of the 5′ flanking region of the EP17 gene and a reporter gene (FIG. 12), as described in Example 5 below. Each chimeric gene comprises a different fragment of the 5.3 kb EP17 promoter region, an open reading frame encoding a reporter gene operably linked to a promoter fragment, and the polyA tail region of Simian Virus 40 large T antigen. FIG. 12 indicates one preferred reporter, chloramphenicol acetyltransferase (CAT). These constructs are alternatively used for in vitro and in vivo assays of EP17 promoter region function.

[0160] A preferable in vitro technique for evaluating EP17 promoter function is a transient transfection assay. According to this method, each chimeric reporter gene is introduced into a relevant host cell, and the resulting level of reporter gene expression is quantitated. Preferred host cells include HeLa and PC-3 cells, or normal or immortalized epididymal cells, described herein below. In these experiments, luciferase is a preferable reporter gene in that it demonstrates increased sensitivity of detection. Transient transfection assays are performed as described in Example 6. Additional methods for making an expression system comprising a promoter region operably linked to a heterologous reporter sequence are disclosed in U.S. Pat. No. 6,087,111.

[0161] To analyze the function of an EP17 promoter region in vivo, transgenic mice bearing each chimeric gene are generated as described in Example 7 below, and a level of reporter gene expression in each mouse is determined. For these experiments, CAT is a preferred reporter gene as it displays low endogenous activity in the epididymis. Several assays are performed to characterize CAT expression in transgenic animals, including PCR using CAT-specific primers, CAT enzymatic assays, immunohistochemistry using an anti-CAT antibody, and in situ hybridization using a CAT-specific probe. Methods for performing these assays can be found in Lareyre, J. J., et al. (1999) J. Biol. Chem. 274:8282-8290, in Lareyre et al. (2001), and in Examples 2, 8, and 9.

[0162] A transgenic mouse bearing the entire 5.3 kb 5′ flanking region of the EP17 gene operably linked to the CAT gene shows CAT expression in the caput epididymis, demonstrating that the 5.3 kb promoter region of the EP17 gene contains sequences required for the region-specific expression of the EP17 gene. Shorter sequences of the EP17 promoter region can be used to define a minimal sequence requisite for EP17 gene expression. In determining a promoter region that reproduces endogenous EP17 expression, the expression profile of each chimeric gene can be carefully compared to the profile of EP17 gene expression as determined by in situ hybridization.

[0163] Within a candidate promoter region or response element, the presence of regulatory proteins bound to a nucleic acid sequence can be detected using a variety of methods well known to those skilled in the art (Ausubel et al., 1992). Briefly, in vivo footprinting assays demonstrate protection of DNA sequences from chemical and enzymatic modification within living or permeabilized cells. Similarly, in vitro footprinting assays show protection of DNA sequences from chemical or enzymatic modification using protein extracts. Nitrocellulose filter-binding assays and gel electrophoresis mobility shift assays (EMSAs) track the presence of radiolabeled regulatory DNA elements based on provision of candidate transcription factors.

[0164] Genomic clones derived from GenBank and proprietary databases (Celera Genomics Corp., Rockville, Md.) were used to predict an hEP17 promoter region comprising an about 5150 base pair region immediately upstream of the hEP17 transcription start site (FIG. 13). This region is similar to the promoter region of mEP17, having putative cis-DNA regulatory elements including but not limited to a Sp-1 binding site, an AP-1 binding site, a cAMP response element binding protein (CREB) binding site, a SRY-related HMG box gene 5 (Sox 5) binding site, a Sex-determining region Y gene product (SRY) binding site, a c-Ets binding site, a GATA binding site, and an Octamer transcription factor 1 (Oct-1) binding site (FIG. 14). The hEP17 promoter is further characterized in a manner as described herein above regarding the mouse EP17 promoter region.

V. Methods for Identifying Regulators of Gene Expression

[0165] The nucleic acid sequences of the present invention can be used to identify regulators of EP17 gene expression. Several molecular cloning strategies can be used to identify substances that specifically bind EP17 cis-regulatory elements. A preferred promoter region to be used in such assays is an EP17 promoter region from mouse or human; more preferably, the promoter region includes some or all nucleotides of SEQ ID NO:1 or 5. FIGS. 15A-15C present data indicating that mEP17 expression is not regulated by hormones. However, studies in which spermatogenesis was disrupted suggest that an EP17 lipocalin can be regulated by germ cell-associated factors.

[0166] In one embodiment, a cDNA library in an expression vector, such as the lambda-gt11 vector, can be screened for cDNA clones that encode an EP17 regulatory element DNA-binding activity by probing the library with a labeled EP17 DNA fragment, or synthetic oligonucleotide (Singh et al. (1989) Biotechniques 7:252-261). Preferably the nucleotide sequence selected as a probe has already been demonstrated as a protein binding site using a protein-DNA binding assay described above.

[0167] In another embodiment, transcriptional regulatory proteins are identified using the yeast one-hybrid system (Luo et al. (1996) Biotechniques 20(4):564-568; Vidal et al. (1996) Proc Natl Acad Sci USA 93(19):10315-10320; Li and Herskowitz (1993) Science 262:1870-1874). In this case, a cis-regulatory element of an EP17 gene is operably fused as an upstream activating sequence (UAS) to one, or typically more, yeast reporter genes such as the lacZ gene, the URA3 gene, the LEU2 gene, the HIS3 gene, or the LYS2 gene, and the reporter gene fusion construct(s) is inserted into an appropriate yeast host strain. It is expected that the reporter genes are not transcriptionally active in the engineered yeast host strain, for lack of a transcriptional activator protein to bind the UAS derived from the EP17 promoter region. The engineered yeast host strain is transformed with a library of cDNAs inserted in a yeast activation domain fusion protein expression vector, e.g. pGAD, where the coding regions of the cDNA inserts are fused to a functional yeast activation domain coding segment, such as those derived from the GAL4 or VP16 activators. Transformed yeast cells that acquire a cDNA encoding a protein that binds a cis-regulatory element of an EP17 gene can be identified based on the concerted activation of the reporter genes, either by genetic selection for prototrophy (e.g. LEU2, HIS3, or LYS2 reporters) or by screening with chromogenic substrates (lacZ reporter) by methods known in the art.

[0168] In another embodiment, an in situ filter detection method is used to clone a cDNA encoding the sequence-specific DNA-binding protein as described in Example 10.

[0169] In a more preferred embodiment, one-hybrid analysis and in situ filter detection methods are used sequentially. For example, an initial collection of candidate transcription factors is identified by one-hybrid analysis, and this initial collection is secondarily screened using in situ filter detection. This combination of techniques provides a smaller but more confident pool of candidate regulators than that selected by either technique alone.

[0170] A candidate regulator to be tested by these methods can be a purified molecule, a homogeneous sample, or a mixture of molecules or compounds. More than one modulatable transcriptional regulatory sequence can be screened simultaneously.

[0171] In accordance with the present invention there is also provided a rapid and high throughput screening method that relies on the methods described above. This screening method comprises separately contacting each compound with a plurality of substantially identical samples. In such a screening method the plurality of samples preferably comprises more than about 10⁴ samples, or more preferably comprises more than about 5×10⁴ samples. In an alternative high-throughput strategy, each sample can be contacted with a plurality of candidate compounds.

[0172] The present invention also provides an in vivo assay for discovery of modulators of EP17 expression. In this case, a transgenic mouse is made such that a transgene comprising an EP17 promoter and a reporter gene is expressed and a level of reporter gene expression is assayable. Such transgenic animals can be used for the identification of drugs, pharmaceuticals, therapies, and interventions that are effective in modulating EP17 expression.

VI. Method for Heterologous Gene Expression in Epididymis

[0173] The present invention enables epididymal expression of a heterologous nucleic acid sequence. In this case, a transgenic animal is generated which bears a transgene that includes an EP17 promoter region and a nucleotide sequence of interest. A preferred EP17 promoter is the nucleotide sequence of SEQ ID NO:1 or 5, more preferably a minimal functional portion of SEQ ID NO:1 or 5 that drives appropriate epididymal expression, as determined by methods described herein above.

[0174] In one embodiment, this method enables assay of the function of a gene of interest in epididymis to the exclusion of other sites of gene function. For example, the heterologous sequence can encode an antisense or ribozyme nucleic acid molecule. When expressed in epididymis under the direction of an EP17 promoter region, the function of a gene corresponding to the antisense or ribozyme nucleic acid molecule is disrupted in epididymis but not other tissues.

[0175] In another embodiment, an EP17 promoter drives expression of a toxin, for example, thymidine kinase plus ganciclovir. Expression of the chimeric gene targets degeneration of the initial segment of the epididymis. In this case, the transgenic animal can be used as an animal model of infertility, described further herein below.

[0176] In another embodiment, an EP17 promoter region drives expression of a therapeutic gene or nucleotide sequence, as described herein below.

VII. Method for Producing an Epididymal Cell Line

[0177] Another aspect of the invention is a method for producing an epididymal cell line. According to this method, a chimeric gene is constructed to express a gene encoding a selectable marker under the control of an EP17 promoter region, and the chimeric gene is used to create a transgenic animal expressing the selectable marker in epithelial cells of the initial segment of the epididymis. Preferably, the selectable marker confers antibiotic resistance, and more preferably, the selectable marker confers neomycin resistance, which can be used to select epididymal cells from non-epididymal cells in culture. Also preferably, the EP17 promoter region used to perform this method is the sequence of SEQ ID NO:1, or functional portion thereof. Similarly, using the mE-RABP promoter, a neomycin-resistant immortalized cell line from the distal caput can be generated by this method.

[0178] Also provided is a method for generating an immortalized epididymal cell line. In this case, a transgenic animal is obtained, having a transgene that encodes an oncogenic virus directed by a constitutive promoter. A preferred oncogenic virus comprises a temperature-sensitive (ts) Simian virus 40 large T antigen (Tegtmeyer (1975) J Virol 15(3):613-618). The ts-Simian virus 40 large T-antigen is completely inactive at non-permissive temperature (39° C.), partially inactive at body temperature, and substantially active at a permissive temperature (33° C.). Immortalized epithelial cells procured from a ts-Simian virus 40 large T-antigen mouse are propagated in culture. In one embodiment, epididymal cells may be selected using the EP17 promoter operably linked to the neomycin resistance gene. Since the EP17 promoter is active in the initial segment, neo-selection will provide a pure population of epithelial cells from that segment. A neomycin-resistant immortalized cell line from the distal caput has been generated by this method using the E-RABP promoter and maintained in culture for 12 months.

VIII. Method for Homologous Recombination at the EP17 Locus

[0179] Another aspect of the invention is a method for mutagenizing the EP17 locus by homologous recombination. The method uses a targeting vector having an isolated EP17 promoter region, a marker gene, and an isolated EP17 3′ flanking region. In a vector so constructed, the marker gene is positioned between the promoter region and the 3′ flanking region. In another embodiment, the targeting vector further comprises a mutant EP17 coding sequence, also positioned between the promoter region and the 3′ flanking region. The targeting vector is linearized by digestion with a restriction endonuclease at a site other than within the promoter region, marker gene, 3′ flanking region, and optional mutant EP17 coding sequence.
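The choice of a linearization site can be sketched in Python as a filter over candidate restriction enzymes. This is a hypothetical illustration: the enzyme names, cut positions, and feature coordinates are invented and do not describe the pLN-17 vector. The requirement that the enzyme cut the circular vector exactly once is an assumption added here, since a single cut is what yields one linear molecule.

```python
from typing import Dict, List, Tuple


def usable_linearization_enzymes(
    cut_sites: Dict[str, List[int]],
    protected_regions: List[Tuple[int, int]],
) -> List[str]:
    """Return enzymes that cut exactly once and outside every protected region.

    `cut_sites` maps an enzyme name to its cut positions in the vector;
    `protected_regions` holds inclusive (start, end) coordinates of the promoter
    region, marker gene, 3' flanking region, and any mutant coding sequence.
    """
    usable = []
    for enzyme, positions in cut_sites.items():
        if len(positions) != 1:
            continue  # no cut, or more than one cut (assumed unsuitable)
        position = positions[0]
        if any(start <= position <= end for start, end in protected_regions):
            continue  # the cut would disrupt a required element
        usable.append(enzyme)
    return sorted(usable)


# Invented example data (not the pLN-17 vector).
example_cut_sites = {"EnzA": [150], "EnzB": [2500], "EnzC": [100, 9000], "EnzD": []}
example_protected = [(1000, 4000), (4001, 5500), (5501, 8000)]

if __name__ == "__main__":
    print(usable_linearization_enzymes(example_cut_sites, example_protected))
```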

[0180] In a preferred embodiment, the linearized vector is electroporated into embryonic stem cells, and successful electroporation is assayed by detecting the marker gene in the stem cells. Stem cells bearing the vector are used to create a transgenic animal. According to the method, a homologous recombination event is mediated at the EP17 locus, thereby exchanging native EP17 gene sequences positioned between the promoter region and the 3′ flanking region with vector nucleotide sequences positioned the same.

[0181] The nucleic acids and methods of the present invention enable knockout, knock-in, and knock-down mutations of the EP17 gene. The phenotype of EP17 mutant animals can be characterized to reveal EP17 function. The expression of EP17 in epididymis, and the known functional importance of regulated epididymal gene expression for male fertility, suggest that EP17 will have a determinative role in sperm maturation. Methods for generating mEP17 mutant mice are provided in Example 11.

[0182] A preferred knock-out mutation removes part of the EP17 coding region (exon 1) and can be generated using a targeting vector as depicted in FIG. 16. Preferred knock-in mutations include mutation of any one of the amino acids within the conserved lipocalin motifs to any amino acid that is a non-conservative substitution. Other preferred knock-in mutations are targeted replacement of one or both of the conserved cysteine residues with an amino acid(s) that is a non-conservative substitution.

IX. Method for Detecting an EP17 Nucleic Acid Molecule

[0183] In another aspect of the invention, a method is provided for detecting a nucleic acid molecule that encodes an EP17 polypeptide. According to the method, a biological sample having nucleic acid material is procured and hybridized under stringent hybridization conditions to an EP17 nucleic acid molecule of the present invention. Such hybridization enables a nucleic acid molecule of the biological sample and the EP17 nucleic acid molecule to form a detectable duplex structure. Preferably, the EP17 nucleic acid molecule includes some or all nucleotides of SEQ ID NO:1, 2, 3, or 5. Also preferably, the biological sample comprises human nucleic acid material.

[0184] In another embodiment, genetic assays based on nucleic acid molecules of the present invention can be used to screen for genetic variants by a number of PCR-based techniques, including single-strand conformation polymorphism (SSCP) analysis (Orita, M., et al. (1989) Proc Natl Acad Sci USA 86(8):2766-2770), SSCP/heteroduplex analysis, enzyme mismatch cleavage, and direct sequence analysis of amplified exons (Kestila et al. (1998) Mol Cell 1(4):575-582; Yuan et al. (1999) Hum Mutat 14(5):440-446). Automated methods can also be applied to large-scale characterization of single nucleotide polymorphisms (Brookes (1999) Gene 234(2):177-186; Wang et al. (1998) Science 280(5366):1077-82). The present invention further provides assays to detect a mutation of a variant EP17 locus by methods such as allele-specific hybridization (Stoneking et al. (1991) Am J Hum Genet 48(2):370-82), or restriction analysis of amplified genomic DNA containing the specific mutation.
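Restriction analysis of this kind depends on a mutation creating or abolishing a recognition site, which can be checked computationally as sketched below in Python. The sketch is illustrative only: the function names and toy amplicons are hypothetical and are not EP17 sequence, the variant is assumed to be a substitution so that positions are comparable between alleles, and the EcoRI recognition sequence GAATTC is used as an example because it is palindromic, so scanning one strand suffices.

```python
def site_positions(sequence: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of `site` in `sequence`."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def compare_alleles(reference: str, variant: str, site: str = "GAATTC") -> dict:
    """Report recognition sites lost or gained in the variant relative to the reference.

    Assumes a substitution variant, so both alleles share one coordinate system.
    """
    ref_sites = set(site_positions(reference, site))
    var_sites = set(site_positions(variant, site))
    return {"lost": sorted(ref_sites - var_sites), "gained": sorted(var_sites - ref_sites)}


# Invented toy amplicons: a single A-to-G substitution abolishes the example site.
toy_reference_amplicon = "ACGTGAATTCTTGACC"
toy_variant_amplicon = "ACGTGAGTTCTTGACC"

if __name__ == "__main__":
    print(compare_alleles(toy_reference_amplicon, toy_variant_amplicon))
```

A site reported as lost or gained predicts a different fragment pattern after digestion of the amplified DNA, which is the basis for distinguishing the alleles on a gel.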

X. Recombinant Production of an EP17 Polypeptide

[0185] The present invention also provides a method for recombinant production of an EP17 polypeptide, as described in Example 12. Preferably, the recombinant polypeptide comprises some or all of the amino acid sequences of SEQ ID NO:4 or 6.

[0186] Mouse EP17 protein was recombinantly produced using the pBAD/gIII vector (Invitrogen of Carlsbad, Calif.). To confirm the production of EP17 protein, total protein derived from transformed E. coli was resolved on a polyacrylamide gel, and Coomassie blue staining revealed two enriched bands of approximately 21 kDa and 23 kDa. Western blot analysis using an anti-His tag antibody revealed the same two proteins, which correspond to the processed and unprocessed EP17 isoforms, respectively (FIG. 17).

[0187] Recombinantly produced proteins are useful for a variety of purposes, including structural determination of an EP17 polypeptide, generation of an antibody that recognizes an EP17 polypeptide, and screening assays to identify a chemical compound or peptide that interacts with an EP17 polypeptide, described further herein below.

XI. Production of EP17 Antibodies

[0188] In another aspect, the present invention provides a method of producing an antibody immunoreactive with a lipocalin polypeptide, the method comprising recombinantly or synthetically producing an EP17 polypeptide, or portion thereof, to be used as an antigen. The EP17 polypeptide is formulated so that it is used as an effective immunogen. An animal is immunized with the formulated EP17 polypeptide, generating an immune response in the animal. The immune response is characterized by the production of antibodies that can be collected from the blood serum of the animal. Preferred embodiments of the method use a polypeptide of SEQ ID NO:4 or 6.

[0189] The present invention also encompasses antibodies produced by this method.

[0190] The foregoing antibodies can be used in methods known in the art relating to the localization and activity of the EP17 polypeptide sequences of the invention, e.g., for cloning of EP17 nucleic acids, immunopurification of EP17 polypeptides, imaging EP17 polypeptides in a biological sample, measuring levels thereof in appropriate biological samples, and in diagnostic methods.

XII. Method for Detecting an EP17 Polypeptide

[0191] In another aspect of the invention, a method is provided for detecting a level of EP17 polypeptide using an antibody that specifically recognizes an EP17 polypeptide, or portion thereof. In a preferred embodiment, biological samples from an experimental subject and a control subject are obtained, and EP17 polypeptide is detected in each sample by immunochemical reaction with the EP17 antibody. More preferably, the antibody recognizes amino acids of SEQ ID NO:4 or 6 and is prepared according to a method of the present invention for producing such an antibody.

[0192] In one embodiment, an EP17 antibody is used to screen a biological sample for the presence of a lipocalin polypeptide. A biological sample to be screened can be a biological fluid such as extracellular or intracellular fluid, or a cell or tissue extract or homogenate. A biological sample can also be an isolated cell (e.g., in culture) or a collection of cells such as in a tissue sample or histology sample. A tissue sample can be suspended in a liquid medium or fixed onto a solid support such as a microscope slide. In accordance with a screening assay method, a biological sample is exposed to an antibody immunoreactive with an EP17 polypeptide whose presence is being assayed, and the formation of antibody-polypeptide complexes is detected. Techniques for detecting such antibody-antigen conjugates or complexes are well known in the art and include but are not limited to centrifugation, affinity chromatography and the like, and binding of a labeled secondary antibody to the antibody-polypeptide complex.

XIII. Screening for Small Molecule Ligands that Interact with EP17

[0193] The present invention further discloses a method for identifying a compound that modulates EP17 function. According to the method, an EP17 polypeptide is exposed to a plurality of compounds, and binding of a compound to the isolated EP17 polypeptide is assayed. A compound is selected that demonstrates specific binding to the isolated EP17 polypeptide. Preferably, the EP17 polypeptide used in the binding assay of the method includes some or all amino acids of SEQ ID NO:4 or 6.

[0194] Several techniques can be used to detect interactions between a protein and a chemical ligand without employing an in vivo ligand. Representative methods include, but are not limited to, fluorescence correlation spectroscopy, surface-enhanced laser desorption/ionization, and Biacore technology, as described in Example 13. These methods are amenable to automated, high-throughput screening.

[0195] Candidate regulators include but are not limited to proteins, peptides, and chemical compounds. Structural analysis of these selectants can provide information about ligand-target molecule interactions that enable the development of pharmaceuticals based on these lead structures.

[0196] Similarly, knowledge of the structure of a native EP17 polypeptide provides an approach for rational drug design. The structure of an EP17 polypeptide can be determined by X-ray crystallography or by computational algorithms that generate three-dimensional representations. See Huang et al. (2000) Pac Symp Biocomput 230-41; Saqi et al. (1999) Bioinformatics 15:521-522. Computer models can further predict binding of a protein structure to various substrate molecules, which can be synthesized and tested. Additional drug design techniques are described in U.S. Pat. Nos. 5,834,228 and 5,872,011.

XIV. Animal Model of Male Infertility

[0197] The present invention further pertains to an animal model of male infertility. Such a model can be prepared by several methods.

[0198] Using a transgenic approach, knock-out, knock-in, or knock-down mutation of the EP17 gene can suppress fertility. In another embodiment, expression of a toxin, for example, thymidine kinase plus ganciclovir, under the direction of an EP17 promoter targets degeneration of the initial segment of the epididymis and thereby compromises fertility.

[0199] The present invention also teaches that an animal model of infertility is prepared by immunizing an animal with an EP17 polypeptide. The resulting immune response in the animal comprises a production of antibodies that specifically bind an EP17 polypeptide, thereby disrupting its biological activity.

[0200] A method is also provided for generating an animal model of infertility by administering to an animal a compound that disrupts EP17 expression or function. Such a compound is discovered by methods disclosed herein.

[0201] Animal models of male infertility can be characterized according to several measures, including in vivo and in vitro assays of fertility, as described in Examples 14 and 15 below, and morphological inspection of the epididymis.

XV. Therapeutic Applications

[0202] Another aspect of the present invention is a therapeutic method comprising administering to a subject a substance that modulates lipocalin biological activity. Therapeutic substances include but are not limited to chemical compounds, antibodies, and gene therapy vectors.

[0203] Compounds that are discovered by the methods disclosed herein are useful for therapeutic applications related to male fertility. For example, a compound that mimics EP17 function, when administered to an infertile male subject, can regulate fertility by promoting spermatozoa maturation in the epididymis. Conversely, a compound that interferes with EP17 function can act to suppress spermatozoa maturation when administered to a fertile subject.

[0204] The present invention also provides a method for disrupting EP17 function by immunizing a subject with an effective dose of the disclosed EP17 polypeptide. The immune system of the subject produces an antibody that specifically recognizes the EP17 polypeptide, and binding of the antibody to the EP17 polypeptide abolishes EP17 function. In a preferred embodiment, the antibody recognizes some or all of the amino acids of SEQ ID NO:4 or 6 and is prepared according to a method of the present invention for producing such an antibody.

[0205] Several studies have demonstrated the utility of immunotherapeutic approaches to contraception and teach methods for preparing and administering such vaccines, including U.S. Pat. Nos. 6,132,720 and 6,096,318, Feng et al. (1999), and Naz (1999). U.S. Pat. No. 6,096,318 additionally discloses methods for chemical modification of immunogenic proteins, and fragments thereof, which elicit an amplified immune response in a subject receiving an injection of the modified polypeptide. Briefly, the antigen modification is accomplished by attaching the protein to a carrier such as a bacterial toxin or by polymerization of protein fragments. This method has been used to modify human chorionic gonadotropin, an antigen that is effective for immunological contraception in mammals.

[0206] The present invention further provides lipocalin nucleic acid sequences and gene therapy methods for modulating lipocalin activity in a target cell. The gene therapy vector can encode an EP17 lipocalin, preferably comprising the amino acid sequences of SEQ ID NO:4 or 6. Alternatively, a gene therapy vector can include sequences encoding a nucleic acid molecule, peptide, or protein that interacts with an EP17 lipocalin. This modulation can affect spermatozoa maturation in the vicinity of a lipocalin-secreting cell. Additionally, a gene therapy vector can include an EP17 promoter sequence of the present invention to provide tissue specific expression of a gene of interest in a subject. Preferably, the EP17 promoter region used to perform this method is the nucleotide sequence of SEQ ID NO:1 or 5, or functional portion thereof.

[0207] Vehicles for delivery of a gene therapy vector include but are not limited to a liposome, a cell, and a virus. Preferably, a cell is transformed or transfected with the DNA molecule or is derived from such a transformed or transfected cell. An exemplary and preferred transformed or transfected cell is an epididymal cell. Alternatively, the vehicle is a virus, including a retroviral vector, adenoviral vector or vaccinia virus whose genome has been manipulated in alternative ways so as to render the virus non-pathogenic. Methods for creating such a viral mutation are detailed in U.S. Pat. No. 4,769,331. Exemplary gene therapy methods are described in U.S. Pat. Nos. 5,279,833; 5,286,634; 5,399,346; 5,646,008; 5,651,964; 5,641,484; and 5,643,567.

[0208] The ability of adenovirus gene therapy vectors to infect male germ cells to the exclusion of embryos fertilized by infected sperm was demonstrated by Hall et al. (2000) Hum Gene Ther 11(12):1705-1712. High titers of the vector were injected directly into mouse testis and epididymis, or alternatively, sperm were exposed to the virus immediately prior to or during in vitro fertilization. The vector carried the bacterial lacZ gene under the direction of the Rous sarcoma virus promoter, and infection was assayed by enzymatic or immunologic detection of β-galactosidase. lacZ expression was assayed during the several weeks following injection, and in preimplantation embryos produced by in vitro fertilization with sperm exposed to gene therapy vector. lacZ expression was observed in sperm but not in embryos, supporting a conclusion that adenovirus vectors pose minimal risk for germ line integration when exposed to male reproductive cells. These studies teach methods for construction of a gene therapy vector and effective administration of a vector in the male reproductive system, as proposed for administration of nucleic acids of the present invention.

[0209] The invention further provides a method for diminishing the fertile capacity of a subject. According to the method, a chemical compound, peptide, or antibody that interacts with an EP17 polypeptide, preferably the polypeptide of SEQ ID NO:4 or 6, is identified. A pharmaceutical preparation is prepared comprising such a chemical compound, peptide, or antibody, and a carrier. An effective dose of the pharmaceutical composition is administered to a subject, whereby the fertile capacity of the subject is diminished.

[0210] The invention further provides a method for promoting the fertile capacity of a subject. In this case, a chemical compound or peptide that interacts with an EP17 polypeptide, preferably the polypeptide of SEQ ID NO:4 or 6, is identified. A pharmaceutical composition comprising the chemical compound or peptide and a carrier is prepared. An effective dose of the pharmaceutical composition is administered to a subject, whereby the fertile capacity of the subject is improved.

XVI. Summary

[0211] In summary, the provision of an EP17 promoter region, a 3′ flanking genomic region of mEP17, a coding sequence of hEP17, and an hEP17 polypeptide is a significant advance in fertility-related research. The disclosed EP17 nucleic acids and polypeptides can be used according to methods of the present invention to generate a mouse model of male infertility, for drug discovery screens, and for therapeutic treatment of fertility-related conditions.

EXAMPLES

[0212] The following Examples have been included to illustrate modes of the invention. Certain aspects of the following Examples are described in terms of techniques and procedures found or contemplated by the present co-inventors to work well in the practice of the invention. These Examples illustrate standard laboratory practices of the co-inventors. In light of the present disclosure and the general level of skill in the art, those of skill will appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the invention.

Example 1 Cloning of Mouse EP17 Promoter Region

[0213] Two clones containing 6.3 kb EcoRV restriction fragments were isolated from the genomic BAC clone 10983 (FIG. 9). The DNA fragments were subcloned in pBluescript SK+ (Promega, Madison, Wis.) using appropriate enzymes. DNA templates for sequencing were purified using the “Plasmid Midi kit” (Qiagen Inc., Santa Clarita, Calif.). Sequencing reactions were performed as described in the Thermo Sequenase™ fluorescent labeled primer cycle sequencing kit (Perkin-Elmer, Foster City, Calif.). DNA fragments were separated by denaturing PAGE (6% acrylamide gel) and analyzed using an ABI 373A automated sequencer (PE, Applied Biosystems, Foster City, Calif.). Nucleotide sequences were analyzed using the GeneJockey™ software available from Biosoft of Ferguson, Mo. DNA sequencing analysis of both clones revealed that the 6.3 kb fragment contains 5.4 kb 5′ flanking region of the mEP17 gene.

Example 2 In Situ Hybridization

[0214] Nonisotopic in situ hybridization was performed on 4-6 μm thick cryosections of fresh-frozen mouse epididymis. Sections were fixed in 4% formaldehyde in 0.1 M sodium phosphate buffer pH 7.2 and then incubated for 10 minutes in PBS containing 5 μg/ml proteinase K. See Sambrook et al. (1992) for a description of PBS. After two rinses in PBS, sections were incubated in 0.25% acetic anhydride in 0.1 M triethanolamine pH 8.0 for 15 minutes. Sense and antisense riboprobes were prepared in 20 μl transcription reactions containing SP6 (Promega, Madison, Wis.) or T7 (New England Biolabs, Beverly, Mass.) polymerase, 1× transcription buffer, 1 mM each of ATP, CTP, and GTP, 0.65 mM UTP, 0.35 mM digoxigenin-UTP (Roche Diagnostics Corp, Indianapolis, Ind.), and 1 μg linearized F3 plasmid carrying the mEP17 cDNA. Unincorporated nucleotides were removed on a Chroma Spin-100 STE column (Clontech, Palo Alto, Calif.).

[0215] Labeled riboprobes were denatured for 5 minutes at 80° C., diluted in hybridization buffer composed of 50% (vol/vol) formamide, 10% (wt/vol) dextran sulfate, 4×SSC, 1× Denhardt's reagent, and 0.5 mg/ml yeast tRNA, and incubated with the sections overnight at 55° C. See Sambrook et al., 1992 for a description of SSC buffer. The slides were washed at room temperature for 5 minutes in 2×SSC, rinsed in STE buffer (500 mM NaCl, 20 mM Tris-HCl pH 7.5, 1 mM EDTA), and then incubated for 30 minutes in STE containing 40 μg/ml RNase A. The sections were washed sequentially for 5 minutes each in 2×SSC, 50% formamide at 50° C., then at room temperature with 1×SSC, and finally with 0.5×SSC.

[0216] To detect hybridized probes, slides were rinsed in TN buffer (100 mM Tris-HCl pH 7.5, 150 mM NaCl), blocked for 1 hour in blocking solution (TN buffer containing 2% horse serum and 0.1% Triton X-100), and incubated for 1 hour in 1:500 diluted alkaline phosphatase-conjugated anti-digoxigenin (Roche Diagnostics Corp.) in blocking solution. Slides were rinsed three times in blocking solution and then in a substrate buffer of 100 mM Tris-HCl pH 9.5, 100 mM NaCl, 50 mM MgCl₂. Color development was in substrate buffer containing 0.17 mM 5-bromo-4-chloro-3-indolyl phosphate, 10 mM N-ethyl-maleimide, and 1 mM levamisole as an inhibitor of endogenous alkaline phosphatase. Color development was stopped with 10 mM Tris-HCl pH 8.0 and 1 mM EDTA. Sections were examined and photographed with a Zeiss Axiophot using both bright field and phase contrast optics.

Example 3 Cloning Human EP17

[0217] To recover a full-length human EP17 gene, a hybridization screen is performed using a human epididymal genomic library probed with the nucleotide sequence of SEQ ID NO:2, 3 or 5, or a portion thereof. Positive colonies are selected, a subset is sequenced, and a clone corresponding to the full-length gene is recovered. Alternatively, primers from the predicted 5′ and 3′ ends of SEQ ID NO:2 are used in a polymerase chain reaction with human epididymal genomic DNA as template to amplify a fragment representing the full-length clone.

[0218] To recover a full-length human EP17 cDNA, a hybridization screen is performed using a human epididymal cDNA library probed with the nucleotide sequence of SEQ ID NO:2 or 3, or a portion thereof. Positive colonies are selected, a subset is sequenced, and a clone corresponding to the full-length cDNA is recovered. Alternatively, primers from the predicted 5′ and 3′ ends of SEQ ID NO:3 are used in a polymerase chain reaction with human epididymal cDNA as template to amplify a fragment representing the full-length clone.
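By way of illustration only, the selection of primers from the 5′ and 3′ ends of a template can be sketched as follows. The function names, the default primer length, and the toy template are assumptions introduced for this sketch; the disclosure does not specify primer length or design rules, and the toy template is not taken from any SEQ ID NO.

```python
# Illustrative sketch: pick end primers from a template sequence.
# Primer length and helper names are hypothetical, not from the disclosure.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers taken from the template ends.

    The forward primer copies the first `length` bases of the given strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and primes toward the 5' end.
    """
    if len(template) < 2 * length:
        raise ValueError("template is shorter than two primer lengths")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Toy template for illustration only.
toy_template = "atggctagcttgacctgaaaccgtttgaccatgcatgcaagcttggtacctga"
forward_primer, reverse_primer = end_primers(toy_template, length=12)
print(forward_primer)
print(reverse_primer)
```

In practice, primer melting temperature, GC content, and self-complementarity would also be considered; those checks are omitted from this sketch.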

Example 4 Primer Extension Analysis to Identify Transcription Initiation Sites

[0219] Total RNA was extracted from the mouse epididymis using a method described previously (Chomczynski and Sacchi (1987) Anal Biochem 162:156-159). The EP17PE2 primer (SEQ ID NO:7) specific for mEP17 mRNA was radiolabeled using T4 polynucleotide kinase in the presence of 100 μCi [γ-³²P]-ATP (3000 Ci/mmol) (Amersham) according to the manufacturer's instructions (New England Biolabs). For each reaction, 10 μg of epididymal total RNA or transfer RNA was hybridized to 1 pmol (10⁵ dpm) of EP17PE2 primer for 12 hours at 35° C. in 10 μl of a solution containing 0.04 M [1,4]-piperazine diethanesulfonic acid (PIPES), pH 6.4, 1 μM EDTA, and 80% (vol/vol) formamide.

[0220] Reverse transcription was performed in 20 μl containing 50 μM Tris-HCl pH 8.3, 30 μM KCl, 8 μM MgCl₂, 6 μM DTT, 0.5 mM of each dNTP, and 50 units Avian Myeloblastosis Virus (AMV) reverse transcriptase (Promega, Madison, Wis.). Samples were incubated for 30 minutes at 42° C., and then a further 50 units of AMV reverse transcriptase were added and the samples were incubated for 1 hour more. Elongated radiolabeled fragments were loaded on a denaturing PAGE (7% polyacrylamide gel) next to sequencing reactions carried out using the Sequenase sequencing kit (Amersham, USB). The clone pHindIII (shown in FIG. 9) and the EP17PE2 primer (SEQ ID NO:7) were used as template and primer, respectively.

Example 5 Chimeric Gene Expression Constructs

[0221] DNA fragments derived from the BAC clone 10983 were generated using appropriate restriction enzymes. DNA fragments were resolved on an agarose gel, purified from the agarose, and ligated into the promoterless pBLCAT3 plasmid (Luckow and Schutz, 1987) by standard methods. This construction enabled expression of the CAT gene by a mEP17 promoter region fragment.

Example 6 Transient Transfection Assays

[0222] Unless otherwise indicated, all media, cell culture, and transfection reagents were obtained from GIBCO BRL (Life Technologies, Rockville, Md.). PC-3 and HeLa cells were cultured in F12K Nutrient Mixture (Kaighn's Modification) or Dulbecco's Modified Eagle Medium (DMEM) supplemented with 50 units/ml penicillin, 50 μg/ml streptomycin, and 10% (v/v) charcoal/dextran treated fetal bovine serum (FBS, Hyclone, Logan, Utah). Both cultures were maintained at 37° C. in humidified air with 5% CO₂. Plasmids were prepared with the QIAGEN™ plasmid kit. Lipofectin reagent and PLUS reagent (Gibco BRL) were used according to the manufacturer's protocol. Briefly, cells were plated at 2×10⁵ cells/well in 6-well plates the day before transfection. After 24 hours, 5 μl of PLUS reagent, 0.5 μg of chimeric construct, 0.5 μg of androgen or glucocorticoid expression vector, and 0.05 μg of pRL-CMV were diluted in 100 μl of DMEM and incubated for 15 minutes at room temperature. Four or eight μl of Lipofectin reagent was diluted in DMEM and incubated for 15 minutes at room temperature. The two solutions were combined, gently mixed, and incubated for 15 minutes at room temperature. While complexes were forming, medium was replaced with 800 μl of fresh DMEM. Following incubation, the transfection mixtures were added to the wells. Cells were incubated for 4 hours at 37° C. at 5% CO₂. After incubation, medium was replaced with 2 ml of DMEM containing 10% FBS and appropriate hormones. After 24 hours, cells were washed once with phosphate buffered saline, 500 μl of passive lysis buffer (Promega) were added, and cells were incubated for 15 minutes at room temperature in a shaker. The cell lysates were transferred to fresh tubes, centrifuged at 12,000 rpm for 30 seconds to remove debris, and stored at −80° C. For efficiency control, Renilla luciferase activity (pRL-CMV) was monitored.
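As an illustrative sketch only, the efficiency control described above can be applied by dividing the reporter reading of each well by the Renilla luciferase reading of the same well. The function names and the numerical readings below are hypothetical and are not data from the present disclosure.

```python
# Illustrative sketch: normalize a reporter signal to the co-transfected
# Renilla luciferase control (pRL-CMV). All readings are hypothetical.


def normalized_activity(reporter_signal: float, renilla_signal: float) -> float:
    """Reporter signal divided by the Renilla control signal of the same well."""
    if renilla_signal <= 0:
        raise ValueError("Renilla signal must be positive")
    return reporter_signal / renilla_signal


def fold_induction(treated: float, untreated: float) -> float:
    """Normalized activity with hormone relative to activity without hormone."""
    if untreated <= 0:
        raise ValueError("untreated activity must be positive")
    return treated / untreated


# Hypothetical readings (arbitrary units) for one construct.
with_hormone = normalized_activity(reporter_signal=1200.0, renilla_signal=400.0)
without_hormone = normalized_activity(reporter_signal=300.0, renilla_signal=300.0)
print(fold_induction(with_hormone, without_hormone))
```

Normalizing within each well before comparing wells keeps differences in transfection efficiency from being read as differences in promoter activity.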

Example 7 Transgenic Mice

[0223] The chimeric gene comprising the 5.3 kb EP17 promoter region fragment and the CAT reporter gene was excised from the pUC18 vector by restriction enzyme digest. DNA fragments were purified on a 0.8% (w/v) agarose gel using the AgarACE™ enzyme (Promega). Transgenic mice (strain B6D2; Harlan Sprague-Dawley) were generated by microinjection of the DNA into the male pronucleus of a fertilized oocyte using standard techniques (Palmiter and Brinster (1985) Cell 41:343-345). Seven independent transgenic lines carrying the CAT reporter gene were obtained. Caput epididymis-specific CAT activity was detected in three transgenic mouse lines. CAT expression was restricted to the initial segment of the caput epididymis as observed for the mEP17 gene. Thus, the 5.3 kb fragment of the mEP17 5′ flanking region is sufficient for region-specific expression and can be used for heterologous expression in the initial segment of the caput epididymis.

Example 8 PCR Assay of Transgenic Mice

[0224] Transgenic animals were identified by PCR-based screening using DNA isolated from the tail of each animal. Approximately 1 cm of the tail was digested overnight at 55° C. in a Proteinase K digestion mix (10 mM Tris-Cl, pH 7.5, 75 mM NaCl, 25 mM EDTA, 1% SDS, 0.5 mg/ml Proteinase K). DNA was extracted with one volume of phenol/chloroform/isoamyl alcohol (25/24/1) and precipitated at room temperature with two volumes of absolute ethanol. Samples were centrifuged at 10,000×g at 4° C. for 15 minutes, washed with 70% ethanol, centrifuged at 10,000×g at 4° C. for 15 minutes, and dried for 2 hours at room temperature.

[0225] 500 ng of genomic DNA were mixed with 1×PCR buffer II (Perkin Elmer), 2 units of Taq DNA polymerase (Promega), 1.5 mM MgCl₂, 1 μM concentration of each primer (primer 1, SEQ ID NO:8; primer 2, SEQ ID NO:9; casein forward primer, SEQ ID NO:10; casein reverse primer, SEQ ID NO:11), and 0.2 mM dNTP. DNA fragments were amplified for 30 cycles (95° C., 1 minute; 50° C., 45 seconds; 72° C., 45 seconds) and 1 cycle (95° C., 1 minute; 50° C., 45 seconds; 72° C., 10 minutes). PCR products were analyzed on a 2% (w/v) agarose gel.
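For illustration, the cycling program recited above can be written down as data and its total hold time computed. This is a minimal sketch; the class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
# Illustrative sketch: encode the cycling program and total its hold times.
# Names are hypothetical; instrument ramp times are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# 30 cycles of 95 C for 1 minute, 50 C for 45 seconds, 72 C for 45 seconds.
CYCLE = (Step(95, 60), Step(50, 45), Step(72, 45))
# One final cycle with the 72 C extension lengthened to 10 minutes.
FINAL_CYCLE = (Step(95, 60), Step(50, 45), Step(72, 600))


def hold_time_seconds(cycle, n_cycles, final_cycle):
    """Sum of hold times for n_cycles of `cycle` plus one `final_cycle`."""
    per_cycle = sum(step.seconds for step in cycle)
    final = sum(step.seconds for step in final_cycle)
    return n_cycles * per_cycle + final


total_seconds = hold_time_seconds(CYCLE, 30, FINAL_CYCLE)
print(f"{total_seconds} s ({total_seconds / 60:.2f} min), excluding ramp times")
```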

Example 9 CAT Enzymatic Assay in Transgenic Mice

[0226] To monitor CAT activity, organs were dissected from a transgenic animal and homogenized by 20 strokes with a B pestle in a glass Dounce homogenizer in 200 μl of 0.1 M Tris-HCl, 0.1% Triton X-100, pH 7.8. Insoluble material was removed by centrifugation (14,500×g at 4° C. for 5 minutes). CAT assays were performed by the two-phase fluor diffusion method as described previously (Nachtigal et al. (1989) Nuc Acid Res 17:4327-4337). Briefly, cell lysate (50 to 200 μg) was added to a scintillation vial with a lysis buffer to give a total volume of 200 μl. The solution was heated to 65° C. for 10 minutes and cooled to room temperature, and a reaction mix (75 μl) containing 2 μl ³H-acetyl CoA (Amersham Pharmacia Biotech), 50 μl of 5 mM chloramphenicol (in water), 7.5 μl of 1 M Tris-HCl (pH 7.8), and 15.5 μl of water was added. The reaction mixture was carefully overlaid with 3 ml of organic phase scintillation cocktail. After 30 minutes, the samples were counted for at least 5 minutes. Quantitative values for CAT activity were determined by regression analysis to give counts per minute, per minute, per mg (cpm/min/mg) of cell lysate protein.
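A minimal sketch of the regression step is given below, on the assumption that each sample is counted at several time points and that CAT activity is taken as the least-squares slope of cpm against time divided by the protein mass assayed. The function names and the readings are hypothetical and do not reproduce data or software used by the co-inventors.

```python
# Illustrative sketch: CAT activity as the regression slope of cpm versus
# time, normalized to protein mass. Readings below are hypothetical.


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def cat_activity(times_min, cpm, protein_mg):
    """Return CAT activity in cpm per minute per mg of lysate protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return least_squares_slope(times_min, cpm) / protein_mg


# Hypothetical repeated counts for a sample containing 0.1 mg of protein.
times = [0.0, 10.0, 20.0]
counts = [100.0, 300.0, 500.0]
print(cat_activity(times, counts, protein_mg=0.1))
```

Using the slope rather than a single end-point count makes the value insensitive to a constant background in the counts.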

Example 10 In Situ Filter Detection

[0227] About 10⁷ λgt11 clones of a cDNA expression library are prepared from RNA containing poly(A)⁺ RNA of the mouse distal caput epididymis. Clones are plated and replicated on nitrocellulose filters. After denaturation and renaturation, the filter-bound proteins are screened with a concatenated oligonucleotide probe containing the nucleotide sequence of the cis-DNA regulatory element. The probe is prepared by nick translation with a specific activity of >10⁸ μg. Duplicate screening using a probe carrying a mutated cis-DNA regulatory element is carried out to eliminate false positive clones.

Example 11 Mouse EP17 Knock-Out

[0228] To recover genomic DNA sequence necessary for homologous recombination, a 129/SvEv mouse genomic DNA library was screened using mE-RABP cDNA as a probe. BAC clone 170K23 was isolated, having 5.3 kb flanking region and all exons of the mEP17 gene. The targeting vector comprises a 5.3 kb EcoRV-SalI fragment of the 5′ promoter region and a 1.9 kb 3′ flanking region (FIG. 9). The entire mEP17 coding region is replaced with a PGKneomycin cassette from the pLNTK vector (Gorman et al., 1996), so that the PGKneomycin cassette is positioned between the 5′ promoter region and the 3′ flanking region. The targeting vector is linearized using an appropriate restriction enzyme, and the linearized vector is electroporated into TL1 embryonic stem (ES) cells. ES cells are selected based on demonstrated resistance to geneticin after 24 hours. Resistant cells are further screened by Southern blot analysis using a probe designed according to sequence of the targeting vector.

[0229] Clones bearing the transgene are injected into blastocysts according to standard procedures (Joyner (1993) “Gene Targeting: A Practical Approach” IRL Press, Oxford). Chimeric mice bearing the transgene are crossed with C57BL/6 females, and agouti offspring are analyzed by PCR and Southern blot analysis for presence of the targeted allele. mEP17 homozygous mutant mice are obtained by crossing heterozygous mice having one native allele and one knock-out allele. mEP17 homozygous mutant mice are confirmed as such by demonstrating a loss of mEP17 expression by standard methods, including Northern blot analysis, RNase protection assays, Western blot analysis, and immunohistochemistry.

Example 12 Recombinant Production of EP17 Protein

[0230] The mature protein coding sequence was cloned into the prokaryotic expression vector pBAD/gIII (Invitrogen). To simplify purification, the pBAD/gIII vector encodes a leader peptide which directs the recombinant protein into the bacterial periplasmic space, thereby minimizing any potential toxic effect. The pBAD/gIII vector also encodes a C-terminal polyhistidine tag for detection with an anti-His antibody and for purification with ProBond resin (Invitrogen). The pBAD/gIII vector carrying the mEP17 coding sequence was transformed into E. coli according to the manufacturer's conditions. Transformed E. coli were cultured and recombinant protein was extracted. To confirm the production of mEP17 protein, protein derived from transformed E. coli was resolved on a polyacrylamide gel and Western blot analysis was performed according to standard techniques.

Example 13 In Vitro Binding Assays

[0231] Recombinant protein is obtained, for example, according to the approach described in Example 12 herein above. The protein is immobilized on chips appropriate for ligand binding assays. The protein immobilized on the chip is exposed to sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Measurement techniques include, but are not limited to, SELDI, Biacore, and FCS, as described above. Compounds found to bind the protein are readily discovered in this approach and are subjected to further characterization.

Example 14 In Vivo Fertility Assay

[0232] Five wild type heterozygous transgenic and five mutant homozygous transgenic male mice are individually mated with five wild type C57BL/6 females during an overnight interval. Females exhibiting a vaginal plug the next morning are isolated. If pregnancy occurs, the genotypes of the offspring are analyzed by PCR amplification of tail DNA using primers to detect the transgene. A lower percentage of pregnant females resulting from mating with homozygous mutant males suggests male subfertility.

Example 15 In Vitro Fertility Assay

[0233] The method used is essentially that of Wolf and Inoue (1976). In brief, male mice are killed, and each cauda epididymis is rapidly excised and minced in 1 ml of Toyoda's medium pre-equilibrated at 37° C. under 5% (vol/vol) carbon dioxide in air. The minced tissue is left at 37° C. for 30 minutes before the tissue pieces are removed. An aliquot is taken for sperm counting, and the incubation is continued for a further 30 minutes. Female mice are induced to superovulate by injections of PMSG and hCG. The female mice are killed, and their oviducts are removed and placed into Biggers, Whitten, and Whittingham medium (BWW). Under a microscope, the oviduct is pricked, and the cumulus mass is removed and treated with hyaluronidase. The denuded eggs are washed through three changes of medium before being allotted to 100 μl droplets of medium under silicone oil.

[0234] Approximately 10⁵ spermatozoa are added to each drop (i.e., 10⁶ sperm/ml), and the dishes are incubated at 37° C. under 5% (vol/vol) carbon dioxide in air for 5 hours. At this time, some eggs are removed and washed by repeated micropipetting before the number of attached spermatozoa is scored. The rest of the eggs are transferred to fresh medium and incubated for a further 24 hours, at which time eggs are scored for evidence of fertilization and development. Experiments are conducted with spermatozoa from a mutant and a wild type male and eggs from a common pool of females.
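The arithmetic behind the stated sperm concentration can be checked with a short sketch. The function names and the stock concentration are hypothetical, and the small increase in droplet volume caused by adding the sperm suspension is ignored.

```python
# Illustrative sketch: sperm concentration in a droplet and the volume of a
# counted suspension needed to deliver a target number of spermatozoa.
# The stock concentration below is hypothetical.


def concentration_per_ml(cell_count: float, volume_ul: float) -> float:
    """Cells per ml for `cell_count` cells in `volume_ul` microliters."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return cell_count * 1000.0 / volume_ul


def volume_to_add_ul(target_cells: float, stock_per_ml: float) -> float:
    """Microliters of stock suspension that contain `target_cells` cells."""
    if stock_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return target_cells * 1000.0 / stock_per_ml


# 10^5 spermatozoa in a 100 microliter droplet correspond to 10^6 per ml.
print(concentration_per_ml(1e5, 100.0))
# With a hypothetical counted stock of 2 x 10^7 sperm/ml:
print(volume_to_add_ul(1e5, 2e7))
```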

Example 16 Regulation of EP17 Protein

[0235] The regulation of mEP17 protein expression was investigated using Western blotting and immunohistochemistry in castrated mice, castrated testosterone-supplemented mice, unilaterally castrated mice, unilaterally cryptorchid mice, and busulphan-treated mice. As previously observed at the mRNA level (FIGS. 15A-15C), mEP17 protein disappeared from the initial segment two days after bilateral castration and was not restored by testosterone treatment. Similarly, after unilateral castration, mEP17 protein disappeared from the castrated side, but not from the non-castrated side. These data suggest that mEP17 is not regulated by circulating androgens, but can be regulated by testicular factors provided via the efferent ducts.

[0236] To determine whether germ cell-associated testicular factors can regulate mEP17, spermatogenesis was disrupted using cryptorchidism or busulphan treatment. One month following cryptorchidism, mEP17 protein was not detected in the initial segment of the cryptorchid epididymis but was detected at normal levels in the scrotal epididymis. In cryptorchidism, the testis and the epididymis are exposed to abdominal temperature. To distinguish the effects of abdominal temperature on the testis or epididymis, spermatogenesis was also disrupted by busulphan treatment. Following a 35-day treatment, the level of mEP17 protein was drastically reduced when compared to untreated controls. Collectively, these observations suggest that mEP17 is regulated by germ cell-associated factors.

References

[0237] The publications and other materials listed below and/or set forth in the text above to illuminate the background of the invention, and in particular cases, to provide additional details respecting the practice, are incorporated herein by reference. Materials used herein include but are not limited to the following listed references.

[0238] Adelman et al. (1983) DNA 2:183-193.

[0239] Alam and Cook (1990) Anal Biochem 188:245-254.

[0240] Altschul et al. (1990) J Mol Biol 215:403-410.

[0241] Astraudo et al. (1995) Arch Androl 35:247-259.

[0242] Ausubel et al. (1992) “Current Protocols in Molecular Biology”, John Wiley and Sons, Inc., New York.

[0243] Baird and Glasier (1999) BMJ 319:969-972.

[0244] Barber and Fayrer-Hosken (2000) J Reprod Immunol 46:103-124.

[0245] Barton (1998) Acta Crystallogr D Biol Crystallogr 54:1139-1146.

[0246] Batzer et al. (1991) Nucleic Acid Res 19:3619-3623.

[0247] Bodanszky et al. (1976) “Peptide Synthesis”, John Wiley and Sons, Second Edition, New York.

[0248] Brookes (1999) Gene 234(2):177-186.

[0249] Chomczynski and Sacchi (1987) Anal Biochem 162:156-159.

[0250] Conner et al. (1983) Proc Natl Acad Sci USA 80:278-282.

[0251] Cooper and Yeung (1999) Hum Reprod Update 5:141-152.

[0252] Costa et al. (1997) Biol Reprod 56:985-990.

[0253] Cornwall et al. (2001) in “The Epididymis”, Plenum Press.

[0254] Cubitt et al. (1995) Trends Biochem Sci 20:448-455.

[0255] Diekman et al. (1999) Immunol Rev 171:203-211.

[0256] Feng et al. (1999) J Reprod Med 44:759-65.

[0257] Fidler and Bernstein (1999) Public Health Reports 114:494-511.

[0258] Glover, ed. (1985) “DNA Cloning: A Practical Approach”, IRL Press, Ltd., Oxford, U.K.

[0259] Gorman et al. (1996) Immunity 5:241-252.

[0260] Hall et al. (2000) Hum Gene Ther 11(12):1705-1712.

[0261] Henikoff et al. (2000) Electrophoresis 21(9):1700-1706.

[0262] Henikoff and Henikoff (1989) Proc Natl Acad Sci USA 89:10915.

[0263] Henikoff and Henikoff (2000) Adv Protein Chem 54:73-97.

[0264] Harlow and Lane (1988) “Antibodies: A Laboratory Manual” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

[0265] Huang et al. (2000) Pac Symp Biocomput 230-241.

[0266] Hutchens and Yip (1993) Rapid Commun Mass Spectrom 7:576-580.

[0267] Joyner (1993) “Gene Targeting: A Practical Approach” IRL Press, Oxford.

[0268] Kamischke and Nieschlag (1999) Human Reproduction 14(Suppl. 1):1-23.

[0269] Karlin and Altschul (1993) Proc Natl Acad Sci USA 90:5873-87.

[0270] Kestila et al. (1998) Mol Cell 1(4):575-82.

[0271] Krull et al. (1993) Mol Reprod Dev 34:16-34.

[0272] Kyte et al. (1982) J Mol Biol 157:105.

[0273] Landgren et al. (1988) Science 241:1007.

[0274] Landgren et al. (1988) Science 242:229-237.

[0275] Lareyre et al. (1999) J Biol Chem 274:8282-8290.

[0276] Lareyre et al. (2001) Endocrinology 142:1296-1308.

[0277] Li and Herskowitz (1993) Science 262:1870-1874.

[0278] Liedberg et al. (1983) Sensors Actuators 4:299-304.

[0279] Luckow and Schutz (1987) Nucleic Acids Res 15:5490.

[0280] Lufkin et al. (1993) Proc Natl Acad Sci USA 90:7225-7229.

[0281] Luo et al. (1996) Biotechniques 20(4):564-568.

[0282] Luyckx et al. (1999) Proc Natl Acad Sci USA 96(21):12174-12179.

[0283] Magde et al. (1972) Phys Rev Lett 29:705-708.

[0284] Maiti et al. (1997) Proc Natl Acad Sci USA 94:11753-11757.

[0285] Malmquist (1993) Nature 361:186-187.

[0286] Mendelsohn et al. (1994) Development 120:2749-2771.

[0287] Nachtigal et al. (1989) Nuc Acid Res 17:4327-4337.

[0288] Naz (1999) Immunol Rev 171:193-202.

[0289] Needleman and Wunsch (1970) J Mol Biol 48:443-453.

[0290] Nikkanen et al. (2000) Contraception 61:401-406.

[0291] Ochman et al. (1990) in “PCR Protocols: A Guide to Methods and Applications” Innis et al. eds., pp. 219-227, Academic Press, San Diego, Calif.

[0292] Ohtsuka et al. (1985) J Biol Chem 260:2605-2608.

[0293] Ong et al. (2000) Biochim Biophys Acta 1482:209-17.

[0294] Orita et al. (1989) Proc Natl Acad Sci USA 86(8):2766-70.

[0295] Palmiter and Brinster (1985) Cell 41:343-345.

[0296] Paterson et al. (2000) Cells Tissues Organs 166:228-32.

[0297] Pearson and Lipman (1988) Proc Natl Acad Sci USA 85:2444-2448.

[0298] Postic et al. (1999) J Biol Chem 275(1):305-315.

[0299] Rose and Botstein (1983) Meth Enzymol 101:167-180.

[0300] Rossolini et al. (1994) Mol Cell Probes 8:91-98.

[0301] Saiki et al. (1985) Bio/Technology 3:1008-1012.

[0302] Sambrook et al. eds. (1989) “Molecular Cloning: A Laboratory Manual” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

[0303] Sauer (1998) Methods 14(4):381-392.

[0304] Saqi et al. (1999) Bioinformatics 15:521-522.

[0305] Silhavy et al. (1984) “Experiments with Gene Fusions” Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

[0306] Singh et al. (1989) Biotechniques 7:252-261.

[0307] Smith and Waterman (1981) Adv Appl Math 2:482.

[0308] Sonnenberg-Riethmacher et al. (1996) Genes Dev 10:1184-1193.

[0309] Srivastav (2000) J Reprod Fertil 119:241-252.

[0310] Stoneking et al. (1991) Am J Hum Genet 48(2):370-382.

[0311] Talwar (1999) Immunol Rev 171:173-192.

[0312] Tegtmeyer (1975) J Virol 15(3):613-618.

[0313] Tijssen (1993) in “Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes”, part 1 chapter 2, Elsevier, New York, N.Y.

[0314] Tobin et al. (1979) Proc Natl Acad Sci USA 76:4350-4354.

[0315] U.S. Pat. No. 4,196,265

[0316] U.S. Pat. No. 4,554,101

[0317] U.S. Pat. No. 4,736,866

[0318] U.S. Pat. No. 4,769,331

[0319] U.S. Pat. No. 5,162,215

[0320] U.S. Pat. No. 5,234,933

[0321] U.S. Pat. No. 5,260,203

[0322] U.S. Pat. No. 5,279,833

[0323] U.S. Pat. No. 5,286,634

[0324] U.S. Pat. No. 5,326,902

[0325] U.S. Pat. No. 5,399,346

[0326] U.S. Pat. No. 5,489,742

[0327] U.S. Pat. No. 5,550,316

[0328] U.S. Pat. No. 5,573,933

[0329] U.S. Pat. No. 5,614,396

[0330] U.S. Pat. No. 5,625,125

[0331] U.S. Pat. No. 5,641,484

[0332] U.S. Pat. No. 5,643,567

[0333] U.S. Pat. No. 5,646,008

[0334] U.S. Pat. No. 5,648,061

[0335] U.S. Pat. No. 5,651,964

[0336] U.S. Pat. No. 5,741,957

[0337] U.S. Pat. No. 5,834,228

[0338] U.S. Pat. No. 5,837,479

[0339] U.S. Pat. No. 5,872,011

[0340] U.S. Pat. No. 6,087,111

[0341] U.S. Pat. No. 6,096,318

[0342] U.S. Pat. No. 6,132,270

[0343] Vidal et al. (1996) Proc Natl Acad Sci USA 93(19):10315-20.

[0344] von Heijne (1986) Nucleic Acid Res 14:4683-4690.

[0345] Wang et al. (1998) Science 280(5366):1077-1082.

[0346] WO 93/25521

[0347] WO 97/47763

[0348] Wolbach (1925) J Exp Med 42:753-777.

[0349] Wolf and Inoue (1976) J Exp Zool 196:27-38.

[0350] Worrall et al. (1998) Anal Biochem 70:750-756.

[0351] Yeung et al. (1999) Biol Reprod 61:1062-1069.

[0352] Yuan et al. (1999) Hum Mutat 14(5):440-446.

[0353] Zimmer et al. (1993) “Peptides” pp. 393-394, ESCOM Science Publishers, B. V.

[0354] It will be understood that various details of the invention can be changed without departing from the scope of the invention. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation, the invention being defined by the claims.

1 11 1 5315 DNA Mus musculus 5′UTR (1)..(5315) 1 gatatctagc atgtttccagtttctggata ttttgaataa agctactatg aacacagttg 60 agccagtgtc cttgtgggatggtggagtat cttttgggta tatgcccagg agtgatatag 120 cagggtcttg gggtagaactatttccagtt tattgagaaa ctgccacatt gatttttcag 180 agtggttgta ccagtttttactcccaccag caatggagga gtgttccctt gtgttccata 240 ccctgtcact tgaggtgtttttttgttttt gttttaagat tttttttttc gagacagggt 300 ttctctgtat agtcctgtctgtcctggaac tcactttgta gactaggctg gcctagaact 360 cagaaatccg cctgcctctgcctcccaagt gctgggatta aaggcgtgcg ccaccacacc 420 cggctgtttt aagatttttttaaaaaaaga tttatttata ataataaata tctaagtaca 480 ctgtagctgt cttcagacacaccagaagag ggcatcagat ctcattacag atggttgtga 540 gccaccatgt ggttgctgggatttgaactc aggatctttg gaagagcagt cagtgctctt 600 aaccactgag ccatctctacagcccatcac ttgagttttt gaatcttatc cattctgaca 660 gttgtaagat ggaatctcagagtccttttg atttgcatct ccctgataac taaggacgtt 720 gaacatttct ttaagggcttctcagccatt cgagattcct gtttaactag gaaactcttt 780 actaactcct ttcttgcctgtcacagccac tgtgaagggt tcacacataa tcgcttctct 840 gcatcgcgta ggtcctaagaccccagtccc caggccctca aggccttttc agacttctac 900 ccaacaatgg gtcttctgggtaacatgatg gtaacaccac ccaaatcagt tactagttga 960 gctagcccac cagagcacctgcctaggtca tcccttcagt tctggagcat gtaggtctac 1020 attgcctaga cccccttctatctccagcac cggtgaggag cccctgatgg ggataattct 1080 ctaaccagct ctgtggggattgtaggggtg agcaggaccc aagcacctag tgggtgtgtc 1140 tgagagaaag gaggaaaatcaatgatcttc ccatgtgcgt ttctctgggt aaggaggtac 1200 tcaggtgtcc tgagtgaccagtgccaacat tctcgggcta tgtgcagggt gagttttctt 1260 ctgagcttgg gtcagcaagatcctatccct gaaatctgtg atccctacta cctgtcgagt 1320 ctcatgctta tggttttagcttagacatgt ctggtagaag atgccaccct ggtgtcttgg 1380 ggccatcata ataagccaaggcaaagagac agaaagaatt ctcttttagt ccaagagacc 1440 gaagtcagga atccatgtgtgtggaggcta gaggagggct cttgccaatc tgcaccccca 1500 acccccagct tctggtctttcttctgttct gggcagtctt aatcttgact gcatcactaa 1560 tatctgcctc tttcttctgcccatctgtgt ctgaattctt ccattcataa ggatgctgtc 1620 tttgggttag ggccatctgaacctgtctgc agaaaccttg tcttcaagta agatattgtt 1680 
tggaggttct gggtgggggatactattcaa tccagtgtac cccatgtttt gctaaccttt 1740 gggcccccct tctcagggcttgtaccatat ctttaattgt atcatcctgt tcccaagaac 1800 cctgaaaggg cagaactatttttcatctac tgctccacta cagggttagg gtagaagtcc 1860 ttggcatctt gatacaggcaggttcatggg gaggcctacc caaccaaacc aatgtagaaa 1920 gctgcccaat tcctacccaagctcattgct aacaattttg gcactcacaa agccacagta 1980 gtagatgaca tgtgagtgtataaaagtcag tggatagagg gggtgggtgt aataccgtat 2040 ggcttatcct gggatggagactttgtggta tttcttgttc ttttcccagc ctcctagttc 2100 tctgcatgct ggtactttgcaggtcatgat aggcctctta ttatcctttc ctcttgcctg 2160 cttaccctga atgcctgctgttttgacagt ccatggatct aaacacctgt gctgagctta 2220 catgttctcc actatcaggtagttttgaaa acgtaagaag ttagactcaa tgtcaaatca 2280 ccctgtcatt tttcctgttactatggtctt ggtctctctt ctcagggcct ggcaaggctt 2340 aggggttact agtcctgtaagaattaggga tccagaagta tggcaggtct ctatggtctc 2400 tgggaaggtg ccttggagaggcatctctag tcgggtcagt ggcagtgtgg gaagttttta 2460 ctggtttata gatgctttggtgacaggatt atctttcctt gcctttgtgg gttggcagcc 2520 ttggggggca gtgcctagcttgtggggagt atggaggatg tttgacaaag gactggtgat 2580 ggagtgattg tgcccttttagggtaggatg aacctggagt tgccaacctc aagagtgtca 2640 ctctcactgt gatctttagggactgagaga tcctcttggg tcccacctcc ttccgtcctg 2700 tgttccttct cttgctcttctcctttcacc caataaacag attcagcatc cgcatggctc 2760 tttgaccctt tggagatgcccttgggcaga tcatagtgaa ctctttcttc atcccagccc 2820 tctggccact tagggttctcctcgagagtg ggcaggatgt ttcattcaaa tttggtgaag 2880 ggaggttgct gcatccacagggaagactga gcacagcata tgtggtatcc acgaggtcac 2940 agtgtctttg aaaccaaggtgcctactgag gatgggttgt aggctctagg ctcccttggg 3000 tctcagaaat tgtatgtatggatatatgaa ctggaaagca ggagggccat aatgctaggt 3060 tttgattccc atctcctcctttgatcttgg acaaatggtt tcgacctttg ggagcctaag 3120 aaataggtca gggcagtggtcactgtgctt gaagaccttg ctaggctagc tgtcatgtga 3180 tttccatgac agacctcctggatacacata gctagctgca tccattcttt tttctcagat 3240 tgagacactg aggcacagaagctggacagg ttactcaagg tcacacgatg gacaaattca 3300 ggggtggcat gatgtagtctctgtgctgct cttgaacaag atgtggggta cgggggttcc 3360 tttccaggta ttatcactgctctcgtcttc 
atgtcacaga attcccattc tccagtggat 3420 ctagttctac ttcaaagtccaggcatggaa acaggaaggg gctggttctg ctccattgct 3480 tatctgtctg tggtcctcagacaacaagaa ctggaagcaa ttagccctca tagctgctgc 3540 cctcatcctg tgccagtcacccagagaacc aggtgccatc cacccacatc tctttccctc 3600 actccagcca aacccctctctgtatcccca agggccctgg cttgtcagtc tgttgttggg 3660 accagcaggc cctcagcttccctcccgatg ccccaaagtc tttggatggt tccttctagt 3720 gctcactgtg cattctctcctaactcatct cttatgccgt cctcacaggc caccctaatg 3780 tcttctgtgc ctggttccctctggagggtg ggcttcagtg gatttcctgt gagctgtcca 3840 ggtcttgagc agctccaggtgggtccaaaa gattagactg ttcctacagg atgaggggtg 3900 ctgacctggg tagagctgttttggggtgcc ctgggacact caagacctat ttctctgcta 3960 cacaagttgg aggttagagatttgggtcac aaggtgacta agtcattttg ggggtgtcag 4020 tcagggaggg ctttgcattcagggtgcttc atggtccctg tttatcctgt ttctttcctt 4080 tttatttgga ggaaatagacaattgtgaga aaaactaata tttttagttt ttttattttc 4140 ctagcgaatg ggagactttgttttagtatc aatcacaaat aaagttgatt ttaaagataa 4200 aagttgtctg ccttaaaaaaattctctgtg gtctgagagg tgtttgtttg tttctgagag 4260 gtgtgaaaaa ccttgtcgttggggctggga agatagctca gctggtaaag tactttcttc 4320 agaacccata taaaagccaggcatggtggt gtgtatttga aatcacatgg tgagtgctgg 4380 acagacagag atagtgggatctctggggct cactggctag gcagcctaga ctactcagca 4440 cgcttcaggt tagtaagaaactctgtctca aaaaataagt aaccagccaa ttgaccaacc 4500 aatcaataaa ctaactaactagctagctag ctagctagct aactaactaa ctaactaact 4560 aactaactaa tgaagcaaaacaaatgaaaa cacaaagtga cttgtactaa aggaacaaca 4620 cctgaggttg aactctggcttccacatcca tgtgactatt tgtatgcttt ccccaaccaa 4680 gaacaaagac atgaacactatatacacaga gacacacaga cacacaaata ccacaaacat 4740 acacaccaca cacaaacaccacatacaaac atacatacca cacacaaata ccacaaacat 4800 atataacaca cacatacacagataccacat acaaacatat ataacacacc acacagatag 4860 agatactata taaaaaaatatacacacaca gaatggaggg aatgcagaat gctgcaaggt 4920 ataaggtata aggttcctggttggagactc aggcctcctt gttcagttca atattttgtt 4980 attgttttaa gcttattgttacaaatggag aaactgagat aaatagcttg tccagttcat 5040 agctcagagc gtggggctggacttggttgg tgcagtctta ctgcatagtt catcactgtc 5100 
cacatgctag gatggaggcagcttaactgt catcttagct tggtcctaca cctctctgga 5160 tgggggttga tagcatttgagcagaagctg agtctctgag cagctgacag ccagctttgt 5220 ccaatgacat tctctaagtggttgcacatg cttgcacact ctccaaatat aagctcccac 5280 cttgcataaa cagaagccacaagccaggcc ctgag 5315 2 30865 DNA Homo sapiens 2 tgtttatttt tattattttttttttttaaa tttatttaac ataattttta ttaataataa 60 taataaatat agtttgtaaattttaaaaat aaaagagaag agataaagaa tagattaaat 120 gaattatata aatgaattaaggtaatactc aatcctttgg atatgtggga ttatacacca 180 tataattttg ttatatttaatattagtata tatggtaggg tgtatgtgag aaaaggtaaa 240 acttaaaaaa gcggtgaattacaagatcgg taacgaaggt cgaaaagaaa ggaaatagcg 300 ggtgtctggt gatgtcttggatttgagggt cgggggtttc gaactctttt taaggtccat 360 tatttaggcg cggcggggagagaggtatgt aaaaccgcaa gatggggtcc ttatcaggat 420 aaggaccacg agtgactgttagcagaaagt aatatcatga ggaaacagta aaaaaaatac 480 ttatgagttg ggttgggaggggactgagct ctgagaaagg accccagttc actatgtcaa 540 aaacgaaaaa aaggcaaggcgcagtggctc acgcctgtaa tcccggcact ttgggaggcc 600 aaggagggca gatcacctgaggtcaggagt ttgagaccag cctgaccaac atggtgaaat 660 cctgtctcta ctaaaaatgcaaaaaaaatt agacgggcat ggtggctact gttatcccag 720 ctacatggga ggctgaggcaggagaatcgc ttgaacccgg gaggtggaga ttgcagtgag 780 ctgagattgc accactgcactccagcctga gtgacagagt gagaccccgt cttaaaaaac 840 aaaacaaggc caggcacggtggctcatgcc tgcaatccca gcactctggg aggctgaggc 900 aggtggatca caaggtcaggagatcgagac catcctggct aacatggtga aaccctgtct 960 ctactaaaaa tacaaagaattatcccagca tggtggtgga cgcctgtggt cccagctact 1020 ccggaggctg aggtgggagaatggcgtgaa cccaggaggt ggagcttgca gtgagctgag 1080 attgcgccac agcactccagcctgggcgac agagcgagac tccatctcaa aaaaagaaaa 1140 aaaaaaaaaa aaaaaaaagaaaaaaaagaa aaaagaaaaa aagaaaagaa aacgaagaaa 1200 aaccacaaat aaataaaagttgaggtatga tgtatgcaca gctgcattcg cccctttgag 1260 ggcacagttg tgtgagtttagacaaatgca aatgttcatg gaaccactgc aagtgcgaca 1320 cagcagctct ggtgcccacaaagcttcctc gaggtccctt gcggtcaatc actgcccgca 1380 cctcagccct cagagccactgggaagcccc ctgccctgtg ggctcacctg tcctagaacg 1440 tcgtataaaa ggagtcgcagatcgcagcct tttccacctg gcttctgtcc 
ctctgcaggc 1500 tgcacctggg attcagccaccttcgttcct tttactgctg agtgtccatc acgtgctgct 1560 catgcagaag tgctcaagttcgctgtaaac gtagttttca tttctcccgg gtggagacct 1620 ccataagtag gtaggttggtggaatgtttc actgcatgag agtctgccag gaagttttgc 1680 agtgtggtcc cgtcttgttcttcgcgcaga gggtgatagc tctgctgact cccagccctc 1740 acaggctcta gggattgagcctgtttctgt ggaacgcctt ccatggtctg cgaggtgcac 1800 agaggggtct caaaaagcttccaggttgca tcttcccatg gactgatgct gagcgtgttc 1860 tcatctccca tccggatgttttctttggtg gagtatcttt tcagatctct tgccttttta 1920 tttttatttg tttatatatatattttgaga cggaattttg ctctttcact taggttggag 1980 tgcaacggcg cgatcttggctcactgcaac ctccccccac ccacctcact ttgggttcaa 2040 gcgattctcc tgcctcagcctcccgagtag ctgggactac aggcgcccgc caccatgccc 2100 ggctaatttt tgtattttgagaagagatag ggtttcacca tgttggccag gctggtctcg 2160 aactcctgac ctcaggtgatccacccgcct cggcctccca aaatgcgagg atcacaggcg 2220 tgagatgcca cacctggcctctttttattt ttaatggagt ggcttgtttt cttattactc 2280 agtcttggga gttcctcatggatgctggat ttcagtctcg tctcaggtgt atgcgttttg 2340 tagatgtttt ctgtagggaccagccccaca gggttggtga gtttctccct gtgtgctgag 2400 atgagagggc atagaaataaggacacaaga aaaagacata aaagaaaaga cagctgggcc 2460 tggggaccac taccaccaagacgtggagac cggtagtggc cccgaatgtc tggctgtgct 2520 gatatttatt ggatacaaagcaaaagggac agggtaaaga gtgtgagtca tctccaatga 2580 taggtaaggt gacgtgagtcacgtgtccac cggatggggg gcccttccct gtttggcagc 2640 caaggcggag agagagagagagagacagct tacatcatta tttctgcata tcagagactt 2700 ttagtacttt cactaattgactactgctat ctagaaggca gagccaggtg tacaggatgg 2760 aacatgaagg cggactacgagcgtgaccac tgaagcacag catcacaggg agacggttag 2820 gcctctggat aactgcgggcaggtctgact gatgtcaggc cctccacagg aggtggagga 2880 gtagagtcct ctctaagctcccccggggga aagggagact ccctttccct gtctgctaag 2940 tagcaggtgt tttcccttgacactgacgct actgctagat cacggtccgc ttggcaaccg 3000 gtgtcttccc agacgctggcgtcaccacta gaccaaggag ccctctggtg gccctgtccg 3060 gcgtaacaga aggctcgcactcttgtcttc tggtcacttc tcactatgtc ccctcagctc 3120 ctatctctgt atggcctggtttttcctagg ttataattgt agagcaagga ttattataat 3180 attggaataa 
agagtaattgctacaagcta atgattaata atattcatat ataatcatgt 3240 ctatgatcta gatctagtataactcttgtt attttatata ttttattaaa ctggaacagc 3300 tcgtgccctc ggtctcttgcttcggcatct gggtggcttg ctgaccacag ttttcctgct 3360 gtatctaact gctggcttaggtgaacctgt caggggcgtg gctgggaact ggaacctcct 3420 gaagaccagt ggaggccacaagcagaagcc ctcggctcct tcccttcccc gatgctgggg 3480 gccaagggtc tttctctgaccctgtccccc cctcacgtta ccagtggagg gtgtccaggt 3540 tcttggcatt ttgaacaaagaattggacaa aacacacaaa caaagcaagg aaagaatgaa 3600 gcaacaaaag cagaaatttattgaaaatga aagcactctt tacagggtgg gagcgggctg 3660 agcaagcagc tcaagggccccggttacaga attttctggt gtttcaatat cctctagcgg 3720 tttaccattg gttacttggtgtatgcccta tgtaaatgaa gaggatgaag tcaaagtcat 3780 tttctcggct gagcatactgtcacgcctgt aatcccagta ctttgggagg ctgaggtggg 3840 tgtatcacct gaggtcgggagttcaagacc agcctgatga acatggagaa accccgtctc 3900 tactaaaaat acaaaattagctgggtgtgg tggcgggtgc ctgtaatccc agatagtcag 3960 gaggctgagg caggaaaatcgcttaaaccc gggaggcaga ggctgcagag agccaaggag 4020 tgcaccactg cactccagcctgggcaacaa gagcaaaact ccatctcaaa aaacaaacaa 4080 acaaaacgaa gtcatgtactcggtatgagc cctacgtaaa cggagaggat ggagtgaagg 4140 tacaaagcca ttcacattcccgtcatcgtt agtgtttcca gttgatttgg ttctaggatg 4200 cccttgggct ccctgcgtccaggccctctt ctcctgcctc actctcactc cagaggactg 4260 gcagtgctgg cctccctggcaggactcccg cccctgacca gaggggtctg ggcgtgctcc 4320 gaggccacct gctgctccttgcccagtgtc cctgggtgct ctgtggggac acagagaggt 4380 cacttgggag gttcctgcctctgccagact gcggcgcctt cctggcgtca ttgccccctg 4440 cacgcttata caaccagtgataaggcggta tccaatgccg taatcctccc cagggccagc 4500 ccagccctgt ccggtcccgtccggtccctg gcctgcttgc tgggagccca gctgggggct 4560 gcatcggggg tggtgggggtctggagtgag gggtgcccag gcctctctca cagtgtgggg 4620 gcacaagctc tgccctggacactctgccag tgtgacccta ccccccatgt tgggggtgag 4680 agaagggggc tttctgggtggccagatccc aggagagggg gcctagccag accaacccca 4740 gcccagctct ctgcagctgtaccccaaagg ccctggccca gcccagccag acagagtgtg 4800 gacagagggg cttgggtgcgagtccaggca aggcaggaca agtctctgct gtggcttggc 4860 agtggccccg tgactggtgacatagtctgg caggcctagg 
ggacaaatcc tcagcccggg 4920 tgcatcaggc cagagcctggctgtctggag ccctccagga aatgaccccc cgggctgggg 4980 gacatccgag tgattgtgccaaacaatgga caagggggcc ggagtctcag agccgaagtg 5040 ccttccgctt ccatctgctccgcccaggag caggtgcacc caggtggtgg cctgggctat 5100 aaagctggcc ccctggggcttggggactca gcaccagggg ctggagggca ggggagggga 5160 tgatgtcatt cctgctcggcgcaatcctga ccctgctctg ggcgcccacg gctcaggctg 5220 aggttctgct gcagcctgacttcaatgctg aaaaggtacc aggggcctct gctgtcctgt 5280 ggtgggtggg agctgggcccctgccagaga caacgtgata attgtgacaa ctgcctttcc 5340 tggggcgctt gctctgtgcccggccaggca cgtgctatag acccactctg cgcttaatcc 5400 taaaccaacc tgtgaatggatcattactgc acccacttca cagctgggaa ctgaggctca 5460 cagttgcact gtgacctacccaagatgagt tcctgtccac ctgggctcag ctcacaccca 5520 agagatggcc actccgagaccccttgctgt gtgacttctg agttgtcccc tgggtccctg 5580 ggcaaggagg gccctgctgctggctgtcct ggcctgcggg tgactaagcc cccggcagtt 5640 ctcaggcctc tggtacgtggtctccatggc atctgactgc agggtcttcc tgggcaagaa 5700 ggaccacctg tccatgtccaccagggccat caggcccaca gaggagggcg gcctccacgt 5760 ccacatggag ttcccggggtgagttacttg ggctgggctg cctgggctgg gctggggagc 5820 tgaggtagct gggtgccatgagtgagccgg ctgtgcaccc ccagggcgga cggctgtaac 5880 caggtggatg ccgagtacctgaaggtgggc tccgagggac acttcagagt cccgggtagg 5940 tcctgcctcc tctgccttcaggggtgctgg atgccgcttc tctgggggtg acgtggggct 6000 gactttccac ccgggcccaagctgtccatc ccagtctcca gccccctgac cgccctcccc 6060 gcccctcctc tgtcttcggggaacaggctc acagcccagc agcaggggca ggcatgggtg 6120 gggtcggtgc ccgggcccatccacttctcc agcctgaaaa ggccgcctgc ctctccgcag 6180 ccttgggcta cctggacgtgcgcatcgtgg acacagacta cagctccttc gccgtccttt 6240 acatctacaa ggagctggagggggcgctca gcaccatggt gcagctctac agtgagtgcc 6300 tgtcccacct ccccgcactggcccctcccc aacttcccac aggccccgcc ccccgttctg 6360 cctacccccg ccccgcccgcctgccgcccc gcgcctcctc acaggccccg ccctcacccc 6420 tccgcccccc gccttttaggccgagtttag gctccgcctc ctcatccccg ggcccgcccc 6480 ttctggtgct gccccatgccctctcctgct gtgacccttc gcacacactc ctggcctggc 6540 caggaccggg tggggtgggagctgggcccg ccgtgctttg gatgtggagg aggttggctt 6600 gggggagggt 
cccgcggtgcatgctcctct ctggcctcct gagagtggtg ccccgccctt 6660 ccctgcctgg cctcgctgcctgagttctgg ggctggggct gaggttcttt tttttttttt 6720 tttttttaat atggagtctcactctgtcgc ccaggctgga gtgcagtggc gcgatctcgg 6780 ctcactgcaa cctccgactccccagttcaa gcaattctcc tgcctcagcc ccccgataag 6840 ctgggactac aggcgcccatcaccatgccc gtatttgtat ttgtaaaaat ataaataatt 6900 ttttgtattt ttagtagagacggggtttca ccgtgttagc caggatggtc tcgatctcct 6960 gacctcgtga tccgtccgcctcggcctccc caaagtgctg ggattccagg cgtgagccac 7020 cgcgccggct ggggctcaggttctttaact cctggggact ccccgattgg agccgacaga 7080 gtgaggctgg aggcttccgggctgcaggga aagaggctca aggggtagtg gggcccagga 7140 ggacagccag accaccccaacctcccagac atcttggtgt gaattgggga agggggcccc 7200 aagtttcctt cagccaggcaggagctttat gccgcgcaca acctgcccaa cccagggtgg 7260 cggccttgct gtcagccctcaggctggtgg agtgaggcca ggagaccccg tgctgcactc 7320 ctgtgagctg ctgagttcacagtgacctct gctctgccca ggccggaccc aggatgtgag 7380 tccccaggct ctgaaggccttccaggactt ctacccgacc ctggggctcc ccgaggacat 7440 gatggtcatg ctgccccagtcaggtaccct ggcaagcccc gcccgcccca tgccccccag 7500 cctctccctc ccttcccatgagcccaccct tccttcccct cacagcacag gcagtgtgcg 7560 ggcagcaggt ggtcttctcttccaggggtg ggctggcagc ccctgcagcc cctaccccac 7620 ctcccccaga acagactacgggagaggctt tgggcagcgc tgggagggaa gctctgggct 7680 ccactctctg gggatggtggagctgtccag tggcccttgg cacccagggg ttgggggctg 7740 cgggaagcga gggcggtcctggccgtcagc atctcacgga tgtcttcctc cccacagatg 7800 catgcaaccc tgagagcaaggaggcgccct gacacctccg gagccccacc cccgcccttc 7860 ccaggtgggt tctccaggccctgcagggga tgccctggtt gcctctcccc tccctatcgc 7920 agcttgacat ctggttctgtctgccccatg cctgccccgc tgttctgcct ggcgacccca 7980 tgcttcaggc ctcagcttagacattacctc ttccaggaag cctcccttga tttcctaagg 8040 tacccatcac agagtaccacaaatggggcg acttaaaaca gcagaagttt ggccgggtgc 8100 agtggctcat gcctatcatcccagcacttt gggaggccga ggcaggcggg tcacctgagg 8160 ttgggagttc gagaccagcctgaccaacat ggagaaaccc cgtctctact aaaaatacaa 8220 aacttagctg ggcgtggtggtgcacacctg taatctcagc tacttgtgag gctgaggcag 8280 gagaatcgcc tgaacccaggaggcggaggt tgcggtgagc 
cgagattgca ccattacagt 8340 ccagcctggg tgacaagagagaaactccat ctcaaaacaa acagacaaaa ccagcagacg 8400 tttatctctc ccagttctggaggccagaag tccaaaatca agacgtccac agggctggtt 8460 cctctggaag ctctgaaggagactccctcc cacaccgccc ccccagctcc cacggctgcc 8520 gcttgtcctg ggcatttcttggcctgtgga agcatcagcc ctgtctccgc ctccgtcctc 8580 ccacagccac cttccttctgtgcatctgtg tccaaactgc cccctagtgt ctgtgctgcc 8640 gaagcgagtg ccccacggccctcttcttag gagcatgcct gtgcttggat tcagggccca 8700 ctgtaaatgc aggatgatctaatctccaga ttctaaacta gtggtatctg caaagaccgt 8760 ttccaaataa ggccacgttcggaggttcca ggtgggcctg aagttttggg gacacgattc 8820 aatccacagc tctgtgtccccgctgtatcc acacagcccc tggctccctc cccagggcag 8880 ggtcctgcct tggaattgtgggagccttgt cttctccccc agggcccagg agggcagacc 8940 acagccttcg cgcgcagtgcccctcacagg agccacgtgc gggcggggtc ttgcggagat 9000 ccccccctaa accagacgccgggagaccgc tgggtccctc gccggggctc accgccaaga 9060 atttgggcac agccacacgtcacgtgtctg acgtgacagt caccggcgtc tggagggaca 9120 ctggcccttc ctggcatggcgggaggaggt gggcggctct gaggcggggc tgtttctcct 9180 gcgtttctgc cgtgctgctgttgcgtttcc tgcatctctg ctcctctcca tgagcttcgc 9240 ctccactccc aggaccctctccctggagac tcgccgtcct gcctggggac actggggctg 9300 ttcagctttg cacaagtctgcaccagcgtc tccctgcaca accccgctgg gttgggaaac 9360 atggggggca acaccaaatcgcccttgtcc agaaggttct gtgggcagaa catgtagccc 9420 catcccgcca tgatcttgaagaagcagctt tcggggtaga gtgccccgcc tgggccacca 9480 cgcgtgaccg tggaggctgtgggttctgtg gtggcttcat ggacattccg gggcttttct 9540 tgccatgcag cccccttcaccgaggaccag gcagaggcca tggctcactg ccactcagcc 9600 agggactggg gcatggcactgggtcctccc gcgccgggga aggtggggaa gtgctggagc 9660 cagggatggc ctctggggcagcctctcgcc tggggcctgt ggggcagcag tggtagctgg 9720 tgtgtgtttc caggtttgcaggagttttgg gagcagaatg agctctccct ggcctgtcag 9780 tggtggcttt ggaagagtgatgcccagctt gtggggagca ggaagggcgt ttgtcgcggg 9840 agtgggtgat ggagtaggcatgcgttttct acgcaggtgg agccaaagca gcaggcgcct 9900 ttgcccctgg agtcaagacccacagccctc ggggaccacc tggagtctct ccatcctcca 9960 ccccccgcct gtgggatgccttgtgggacg tctctttcta ttcaataaac agatgctgca 10020 gcctcatggc 
cctcacctctttggacatgc cctgggggag ggcaatgcgg gcccctgact 10080 gaccccagca ctgctggggagactgaggca gggaggcaag tgtgagaagg ccctggccgg 10140 gtgagcaggc cctggggtgagagggcaggg ggcaggggag aggggccctg ggaaggggca 10200 gggggcaggg gtgaagggccctgggtgagg gggcagggga caagggtgag gggccctggg 10260 tgaagaggca ggggacaggtgtgaggggcc ctggggaggg gcaggaggca gcagccactc 10320 ttggcctgtg tgggggctgcacccacagga agaatggagc atcctggccc agtccctatg 10380 ggtctgcagt gctgtggggaactgggcacc ctgtgaagac ggcccggcca ctggggctgt 10440 gctgggcaca caggccccggtgctggggcg gggagctggg ggagctgcag gaggccgggg 10500 agttaggccc acttgggttttgcccctcac cactgatatg atcgtgggca cctggtttac 10560 acccctggaa atgggtcaccgttgtggcca ggggactgcc agccagcctg agcgctccac 10620 agcagaccca cagacataggtagctacccc gcgtactcct cgtttcaaag atcaggaaac 10680 tggcctggcg cggtggctcacacctgtaat cccagcactt tgggaggccg aggcgggtgg 10740 atcacttgag gccaggagtttgagaccagc ctggccaaca tggcaaaacc ccgtctctac 10800 taaaaataca agacttaaccgggcgtggtg gcgcacacct gtagtcccag ctactcagga 10860 ggctgaggca ggagaatcgcttgaacccag gaggcagagg ctgcagtgag ctgagattgt 10920 gccactgcac tccagcccgggcgatggagc aagactctgt ctcagaaaaa aaaaaaaaaa 10980 aagaaaaaaa aatcaagaaaccgaggcaca gaagggttgg gtaacttatc caaggctgca 11040 cagccaggaa gaggccaggcaggctgcaag cctgggtgat caggctccag ggccgggctc 11100 cacactgccc ccgggtggtgatgcctggtg cagcagaggc cctccccact gtcactgtca 11160 ctctggtcgc tggcgctgtgcctccggcct ccggagctct gactgcccac tgcgtctggc 11220 acacatttca aagggccaggcgtggggtgg cacaggaaag tgctggaggc tctggtccgt 11280 tgcccatcag tgcccacagccctcgcagcg caggccagaa cgagcccaaa aaaacaagtg 11340 tcatgtgggg tgggggtcaccgagccccag gtggtgtggg tggagctcct cttcctctca 11400 ggctccgaga cggccccagcagcgtccacc gctgtccata cgccaggagg gtggctgggc 11460 aggctgctgt ctaggccaggctaacccctg cggtgggcgt gggtgtcacc agggccgatg 11520 gcgcttgtgc agaaacccacgtctctgagc tgccagcagc caagctgtcc tgatgacatt 11580 cccgggtggg cgcacaagcctgcactgtcc gtatagaatc ggcccaggct gtgcagcagg 11640 ggaacccgga gcccggaccccgccacggag gccaggctgc cgtgcaccat cctgggtgtc 11700 ctcgtggtgc 
tccgggcgcaggtggcagca gccatggagg agctggaccg gcagaaggtg 11760 gcttctcctt cttagttcgtggggctcctc ctacaccccc aacccctcag gctcagggaa 11820 ggaggcctct cccggtgtggggagctcgtg gggacgctgg tgcccggcta gacacttcct 11880 gttagcggca ttttcttctccgctgagtct gtgccggctg ctgggccaga ggcacattag 11940 caggcccaga gaaggtagatgccggagacg aagattcttt cctcccgaaa atggtagggt 12000 ttttaaaagt ctcagcggaagtcccggctc tgggccggtt gctgagggca ggaggcccat 12060 cccctggcgt ggttggcaggctggcgagct gcgctacccg agccacctgt tctctggcgt 12120 ttctcactcc gccgcgccctgcgggctttc ttttccaggc ccctgctcct gggtcctgcc 12180 tccgaggtca ggcagggcctgtggttcctc ccgacatgtc gcagaagccc cagggactgt 12240 tccgcagctc tagatggcccagtggggagg ggctgccctg tgggcattgc tgtctgatgg 12300 cctgaaggca ccgcttggagggacacatgc ctggggacag tgggctcaca gatgtcttgc 12360 tgctttgcca ccgagcctcacagccatctg ctgacctctc agagcccagc aggcccctgc 12420 ccgggggttc gtggaatgcccctgggggtc tcagacccac tgctcagctc ttggccaggc 12480 tccgtatctc tctagattggaggattctgg agggaagtcg gtgtggcctc cgatcaaagc 12540 ctggtgctga cggccccgaagcgggtggag ggcttgttcc tcaccttgag cgggagtaac 12600 ctgaccgtga aggttgcatataacaggtaa gaggcctggg ggctgctgga gaggaagagg 12660 catgcatggc cggggactgggcgtgggttg ctgcacccgc tgcagcgtgg ggtctgggct 12720 gagtgccgtc ccgctggtgggccagcactc ccggtgcccg tctgcctgag ccctcgaccc 12780 ccaggctcct ggatctgcattttatcccaa atataagcat agtgttttaa tgaagccccg 12840 tgaacttaga aggaaggaaaaggggtgtgg gacaggtgac tggctggagg gaacagaatc 12900 cccaggggaa ggccccacctggcacattcc ccgcctcatc ctgctccagg ccctgggata 12960 ctccacagtg ctgcctgtgcccttcctaaa gacacagccc cactccggga aggagcccag 13020 caggcgggac acaggctagcattttaccga tgggcacttt gctgcctttc agctcaggaa 13080 gctgtgagat agagaagatcgtgggctcag aaatagacag tacgggaaaa ttcgcttttc 13140 ctggtaagtg cagttgccctgtgatggcag gtggaacccg gctgtgcaca cagctaggcc 13200 ttattgttcc ccatgctgttccctgcactg ttccccatgc tgttccctgc actgttctct 13260 gtgctgttcc ctgcactgttccccatgctg ttccctgaac tattccctgt gctgttcccc 13320 atgttgttcc ctgcactgctccctgcactg ttccacatgc tgtttcctgc actattcccc 13380 atgctgttcc 
ctgcacttttctctgcgcca ttccccatgc gttccctgca ctgttccctg 13440 cactgttccc catgctgttccctgcgctgt tccccatgct gttccctgca ctgttcccca 13500 tgctgttccc tgcgctgttccccatgctgt tccctgcgct gttccccatg ctgttccctg 13560 cactgttccc catgctgttccctgcaatgc tccctgcact gttccccgca ctgctccctg 13620 cactgttccc catgctgtttcctgcactat tccccatgct gttccctgca cttttctctg 13680 tgccgttccc catgcattccctgcactgtt ccctgcactg ttccccatgc tgttccctgc 13740 aatgctccct gcactgttccccgcactgct ccctgcacta ttccccatgc tgttccctgc 13800 acttttctct gtgccgttccccatgcattc cctgcactgt tccctgcact gttccccatg 13860 ctgttccctg cactgttccccatgctgttt cctgcaccgt tccccatgct gttccctgca 13920 gggttccctg cactgttccccatgctgttc cctgcacatt tcatgcccca gaccttccca 13980 ttctcccacc aacacactggatcatccttc aaaagcttct gtagtgtctc caaccactca 14040 agtgctggga ctgggtgggggcaggatgga gttagaccct gcagaccctg gccttcgagg 14100 tccgtccccc tcagacgtctcccccaacgc catggccggc tcttgaaggc cacagagaga 14160 tccacgtgct ggacaccgactacgagggct acgccatcct gcgggtgtcc ctgatgtggc 14220 ggggcaggaa ctttcgcgtcctcaagtact ttagtaagct tggccctggg gggctctgcc 14280 cagctgctgc tctcccagggactgcccgcc cagcccccct gtgccccaca gctcggagcc 14340 ttgaggacaa ggaccggctggggttctgga agtttcggga gctgacagca gacactggtc 14400 tctacctggc ggcccggcctggtgagccca ggggccttgg ggtggaggct gggctgggcc 14460 ctgtgggctg actctgcagctcctcatgct ggcctatcct gcagggcggt gtgccgagct 14520 cctgaaggag gtgagcttgacccccgaccc tggcctgtgc tgaagttccc gggcccctgg 14580 cccagtccct ggccctgtcaggagcccccg tggctccgcc tcccggccct gggctgggcc 14640 ttctcacccc ttcctgtgaacaggacacca aacaccactg gtgggcagct ccagagatga 14700 gtctgtctcc tggtttggaaagagctggaa cctccagggt ggtgacccta ggctgccagg 14760 cagggaccct gggaggctggggtcacgggg tgcagagctg ggtggggcag gggagcagaa 14820 atggcgcctt ttcttcggtgttccgtgcag gactgccggc tgcttctgcc cccgaaggtc 14880 ccgtcggcgg cggggcacagatcctgcggg cgctgcctca gggctcccat gttgggcact 14940 gcgagaaccc agtgtctccctcacctcgct ttgtcttggc cctagaggct gggcctgtta 15000 ccccattttg cagattgagaaggcgctcag ggagctgggt gctttgcgca aaaccaggca 15060 gcgaggacag 
aagtcccgccgtgtggccct catcgaagcc ccgtggggcc tccagagacc 15120 acacgggcct gagcccctgcacttctgtgt cgcaggagct gatttaatgg agttcctgcc 15180 tcagaccaca aggttcggagcgcccgccca cccctgcccc tcctgggcac cctgcccacc 15240 aggtcacctg cacctgctctgaataaactg tgaagtcaag ccactgcctg gtgtgtcctt 15300 ccggagggcc gatgggtgacaggtgtgggg ggactcaccc gccccaggtc tggcaacagg 15360 aggtgtgttt tccgtggttactggacaaac agtgcggctc acgtgcaagg cagcaacccc 15420 tgccctcccg ccctgctagggtcgtggtca ggctgcccca gggtatagac cagagacaga 15480 aacagggctc ctggtggagtctcagcaggt ctgtcaaacg catggaggcc acacccctcc 15540 tgcctgcatc ctccatggctccccactgcc cagggctcca ggctcacggc ccgtgctgaa 15600 ggccctaccc gatgggccccgcttccggcc tccctggacc cccagctgcc cagccctgct 15660 cagtgtccgc ttccacagtgggctgtcacc ccgtggcctc ggctgccacc agccgactgc 15720 atcactttct cctcccgcctggcctgggac acgggaaggg caaggaaagt gacagtgtca 15780 gcccagaggt tctgcgcccaggcccctggg gaatcagccg gtccccacat atgcaggtga 15840 gacaacccag gcttatgagggtggctgcct cgtcggggcg gcggaaaatg atacctgagg 15900 tcgagtggtg ccgggagctctgggggaaca gcccgagtag ggggacctcg ccagccccca 15960 gcacggccgg gacactggaaaggccctgct aggaacaagg gctgcaccct agatgcgcct 16020 caaatccaga actgtgcccccgagacccac aggagggaca tgtagaggct gcattctgac 16080 aagttagggg ctcctgggaaacagccgtga ctttgcgcag ctgtgcctgg ccgaagccca 16140 cgggttcaag gagcttcgcctggacttgga tttcctgctg gaggggctgt cactctcgct 16200 gtcagctcag gctcccgtgacaaaataccc cagactgggc accttaaaca acagaactgt 16260 atttcctcac cgttctggagggtggaagtc caagattaag gtgccggaaa gtgcagtttc 16320 tggtgagggc ttcctgcccattcgcagaag gcagcccctc gctgtgtcct cacctggcag 16380 ggagcgctct ctgggtgcctctttttctcc taataaggat atcagcctgt taaattaggg 16440 ccactccctt tgacctcacttaatctcgat tgtctcctaa gagtcccatc tcctgaaaca 16500 gtgaccatgg gatctggggcttcaacgcag gaattttggg agggcagaat tcatcccaga 16560 acaaaagtca gtgctgtttagagggtgaca gtagatgcag ggctgccgtc atcaacaacg 16620 tcctgcccaa aagtcactggacgtccaaag acacagaaaa gtatgaccta tgttaaaaaa 16680 aaaaagaaaa aaaaagttactaaaaactga caccgagaag acccaaagtt gtattcagaa 16740 gataaaggtc 
taaaacagccattataagta gttaaaagaa aagcaagata tactctaagg 16800 aaagaaatag tgacacaatagcattaaagg actaactaat tgataaagaa ttaaggagtg 16860 aacgtaggag atcagatcaggaaatcaaga gagaaatgga aactataaaa aagaaaaaat 16920 tatggaactg aaaagtacaataaatgaaaa atttactgga aagactaata ggagttggcc 16980 atggcaaaaa taatgtccatgaatttgagc acaaatcaat gtaaattatc cagtctgagg 17040 aacagagaag aaaatggttgatagcacctc aaagatatgt ggggcaagat aaagaatgta 17100 aactagcctg agaaagagaagagagtgaga atacatgaag aaacagtggc caaaataacc 17160 caaacttgat ggaaacatcaacttagccca gaggctcaga gaaacccagg caggatgaaa 17220 aagaaagcct cacaaaaatcacacctgggg acattatagt caaagcactg ccaaccaatg 17280 atttggataa aaatttaaagggatctgtgg attaaagaca gggcacatcc aggtttggtg 17340 gctcacgtct gtaatcccagcactttggga ggctgaggca ggaagatcac ttgaggtcag 17400 aagtttgaga ccagcttaggcaacatagtg agactttgtc tctctttaag aacaaacaaa 17460 aaagacaggg catgatatagtttggctctg tgtccccacc caaatctcat gtggcattat 17520 tgcaatgccc aatgttggaggtagggcttg gttggaggtg attggatcat ggaggcagat 17580 ttccccttgc tggtcttgtgacagtgagtg agttctcagg agatctggtt gtttaaaagt 17640 gtgtagcact tgccctttcattctctctcc tgctggccat gtaaagatgt acctgcttcc 17700 ccttcacctt cagctatgattgaaagtttc ctgaggcctc cacagctgtg attcctgtaa 17760 agcctgtgga accatgagtcaagtaaactt cttttcttca taaattaccc agtctcaggt 17820 agttcttcat agcaatgtgagaatggacta atacagggca taaacagggg gacacaatgg 17880 ctgtcttctc atgataaatggaaaccagga gatgacaata ggccatcttt acattgatga 17940 aaaaaaaaat caacgcagagctctttatct aggacaaata tccttcaaaa ctgagggtca 18000 aataaacata cttttcaacaaatggaagtt gagggaattt atctctagca cagcttcagt 18060 acaataaata aatgccaaggatgttcctta ggcagaagga aatgactctg gatggccact 18120 caggtctgca ggaagaagatccgtgtgaca agtggggacc aagtgagccc actggaacct 18180 gcattccttg ccctccatttggttacttaa ttatatcagc atggactcag cagttcctgt 18240 tttcttccat aggttacaatttatgacaat catttctacc caatgggagt gccttcaaag 18300 tggcttcaga gtgttgttggtacattccct gattccttga gcacattttt gctttttggc 18360 acaaaaaaca ttccaagttcctcttgcatt ttctgtacac aggctttgaa taattcacca 18420 gagtcctggt 
tccttttcatatagaatggt gtgtggagaa accaagctct gagtgttggg 18480 tgtgctttta cttttactgggtccgattgt atctaaacct tctcaacaca cagagcctgg 18540 actttccttc ctcctcttcctccctccttc ttcttcccct ctcctgcctc cctctctctc 18600 tctccttctt cccttcccctcccatttctc cctccctctc tccctccttc ctctcctccc 18660 tgccctccct ccctcccttccctccttcct cccctcctcc ctctctcctt ccctccctgc 18720 ctctccccat gtgtctaccaaaattggagt tcacatggat atttccaatt ctaatcagac 18780 gccacagggt tcactttgatctctcccctt tctatatgtc tatatttata attctctttt 18840 ccaacacaga gaaacccattatccttgata tatttactca tttggtccct tctagagcac 18900 acggaaagta gtatcagaattgctcaccca aaccactgtt agtcatgaac ctgcccatga 18960 gagcttagga tgtgtttacagctctttctg ctgttagctt gagaatacag agaaaaacta 19020 ctgtgttcag aagttaccaggggcagtcct ttccccttca gtgtggctgt catttatcta 19080 taatgcaact atatcccatgtttggaattt gtttatttta cttttgagta taacagtata 19140 acataccaaa ttactttgattccaaagtca aaactataca aaaacgttta ctcaaaaaag 19200 tctaacttca tccccccaccctgttccagc tccccacccc agcagggaac tagtctcatt 19260 agactctgag gtttatcctttgtttttgtt tgtttgtttg tttgtgtgtt tgtttgtttt 19320 tttgagacgg agtcttactcactctgtcac ccaggctgga gtgcagtggc gcaatctcgg 19380 ctcactgcaa cctctgcctcccgggttcaa gcgattctcc tgcctccgcc tcccgagtag 19440 ctgggattac aggcttgcgccaccaccacg cccggctaat ttttgtactt ttactagaga 19500 tggggtttca ctaagttggccaggctggtc tcgaactcct gacctcaggt gatctgcccg 19560 cctcggcctc ccaaagtgctgggattatag gcatgagcca ccacacccgg ctgatgtagc 19620 gtgctgtttt gtggctgcctcgagaaccag gctccccgca gcccttaggg acactagtgg 19680 ggctggcagg tgtttccttctcagaggcct ggcccagctg ctcggcttcc tcctttttgt 19740 atttttattt tattttagagatggggtctt gctccatcac ccaggctgga gaggagtggc 19800 acgatcacag ctcactgcagcctcaacctc ctgggctcaa gcaatcctcc tacctcagct 19860 tcctgagtag ctgggactacaggagcacgc caccatgtcc agctaatttt taaacttgct 19920 tgtagagacg aagtcttgctatgtcgccca ggctggtctc aaactcctgg actcaagtga 19980 tcctcccgcc tcagcctcctgaagtgctgg gattaccgcg agcagccacg gctccctcct 20040 gaaggctcca ggcctaagtgtcccgctccc ttgttctgtg gctcagcccc tgggcaactc 20100 ctgcttgcca 
tggctgtggctggcacctcc gtggccccac tgcccggtca gtcttcgcag 20160 gcccagcccc cagtagctgtgtggtttctg ccttcctcct gcactctggt ggtcacaggc 20220 cctgctgcgt gggtacagggcaggtggcca gggctgcagc ataagctctg tgagtgcagg 20280 cggccctagg cactggagccttgttgcatg gacgtgctcc ttgcaccctt gttgatgaat 20340 gagcgaaggt gtgtgggagtgaatcccaga gggtgcaggc cacggcactg ggggaggacc 20400 ctcccggctg cctcccagtgggcttcaggg gctcagatag accctgccga gcacctctgt 20460 cctctgctcc cggactggggctgcagagag ctctcaaggg gccttgggag gcacaagcag 20520 gccagggtcc cagggaggggaggcaggagg cagcagggcc ggatggtggg ctgcagaggt 20580 ggaaggaaca ggacagtctgttctgggggt gctggctgtg ggggacagag gccaggccac 20640 aaagcagaat ggtgggtgggccaggaggag gtgggagcct ttttcaaaag gctttcaggg 20700 actttttata tttaatgattgaaacaaaag atatccaaaa ccgcagttcc tgagaaccac 20760 ttgttctcgc cctggttcctctccttgggc ttgaccttgg cctcctgccg ggcctggcag 20820 cggacagcag tctccctgcttcccaggctg ggcgagggag agcgtgggtc cccaggaacg 20880 ggggaggtgg tgcggctggctttaggatgc ccattctgga accttctggt ggaagaaagc 20940 aggggatcag tatagtatggttgatccaca cactccagac cctctcctgg ctgtgcagct 21000 ggtctgatct gggctcagaggctgccactg ctcagtgctc ccctgggtaa ttttgcagga 21060 ggccaggagg gttcctggcagttagtgcca cccacacgga gaaattcaga gccatataaa 21120 cgggtctcca gggcctggagggactgcaca tcctgggctt gcggcgcagt gtagacctgg 21180 gaggatgggc ggcctgctgctggctgcttt tctggctttg gtctcggtgc ccagggccca 21240 ggccgtgtgg ttgggaagactggaccctga gcaggtacag tcctcctggg ggtggggaga 21300 gctggtcctc gggggccagcccctccttta aggccacaca gcttctgggc cccccagggc 21360 tagcccagac cagcatgagcagtggggaat tagttgggcc agcccttggg gagtcccaca 21420 ggcaggagcc tcagggcaggaggggtggat gctggagggt tggaggctgg agggctggag 21480 aattggaggc tggagactggaggttgcagg gctggagggt gtagggttgg agaactggag 21540 gctggagggc tggaggctgtagggctggag aactggaggc tggaggctgc tgggctggag 21600 gctggagggc tgtagaactggaggctggag ggctggaggc tggagggctg cagaactgga 21660 ggctggaggg ctggagagctggagactaga ggctgcaggg ctgcaggctg gaggttggaa 21720 ggctggaggg ctggagcctagaggctgtag ggctggaagg ctggaggctg gagggttgga 21780 ggtctggggg 
gttggaagctggaggctgga gggctggaga actgaaggct agaggctgca 21840 aggctggaag gttggaggctggagggcggg agggttggag actggaggct ggagggctgg 21900 atggttggag ggtggaggctggaggctgca gggctggaag gctggaggct ggagggttgg 21960 aggctggagg ctggagggctggatggttgg agggtggagg ctgaggctgg aggtttggag 22020 gctggaggct gaggctggagggttggaggc tggaggctgg aggctggagg gctggcggcc 22080 ctaacgggag ccgcctcaatgcagcttctt gggccctggt acgtgcttgc ggtggcctcc 22140 cgggaaaagg gctttgccatggagaaggac atgaagaacg tcgtgggggt ggtggtgacc 22200 ctcactccag aaaacaacctgcggacgctg tcctctcagc acgggtgagt gggcgggtcc 22260 tgccaggcct tcccgcaggcaggactgtgg ctcagccaca tcatgtactt tgcacatctg 22320 ctcccgggca ccgcggcctgggggcttccc acggcccctc tgcacccccg tcacttttcc 22380 cactgggctc tgtgcagagccaccagccct ctgcccagct actcacatct gtgtccctgg 22440 ggcctccagg tggcactcccaccttcaata ccgcggcggg cactgctctg actatgctcc 22500 ttcatgcatc cctgtcctgggtcatctggg ctgcaggctg tggtcagagc tggggaagcc 22560 tgtcctctca gcaggcctttttcgggaatt ctgtttaaag gaggataatg cattcggacc 22620 taagaaccac tggttctttcataacataaa tccccaaaca atatggtcat ggacgctgtt 22680 ggccagactg gccttgtggcctcaggcagg cgcacccgac cacatgcatg gctggtggcg 22740 gcgtgcaggg gtcgggtgggccaggctgtg gggcggccgt gctacacagg ggacacctac 22800 agggcagccg ggctgtgggggtagccgtgc tggggcagga gtcggtgccc gcctcccttc 22860 acagaagagt ggcctgctgggtggcgtggt gggtgtctac cccctggcag gctggcgaca 22920 gagcccaggc ctgaaccactgagccaggta cggggcccag agagaagaac ctgccctcct 22980 caccagcgtg ccttccccctgcacctgcgc ctgcagggta tgggcagggg ccccactgca 23040 ccccccgggc caggcctggcctcagggcga gctgggacca aggggctcca ggctgaggtg 23100 gcagctccca cgtggctttcctagttcttc ttgctgttcc tcccgcatgg actcagcctc 23160 agtttccttc cttggcaaacaccattgcac ctttgctggg ttttgctcac tctggggact 23220 gaccacttag ggagcgccggccccttcccc cgagtcgcag gcaaaccccc accccgcccc 23280 atccttcccc tgagtcgcaggcgaaccccc accccacccc atccttcccc tgagtcacgg 23340 gcgaaccccc accccgccccatccttcccc tgagttgcgg gcgaaccccc accccgcccc 23400 atccttcccc tgagtcacgggcgaaccccc accccacccc atctttcccc cgagtcgcgg 23460 gcgaacctcc 
cgtctgtcgagtctcctgtt gtgtcctcag cagcattcag cgttcctggt 23520 ggagatctgg cactgaggatctctcaggaa gtgaatggga gtgtccgccc ctctcaggac 23580 tcccacgccg tccactctccgagaacaggt ttccgggcat cggggcatct ggcagaagat 23640 ggaagctctg gcctcactgctggctggcct ggcctggact cctgagctcc gtctcctctc 23700 accgggcccc ggggtcttgaccctgagtgg gtgacaggcc ccttcttttc caggctggga 23760 gggtgtgacc agagtgtcatggacctgata aagcgaaact ccggatgggt gtttgagaat 23820 ccctgtgagt ctgacggccacggcctcacc caggccgttc ttccctgcag catccccagg 23880 ccccctgcgg gcagggacggagggctgact ccgctcccga gagccaggag gctacaggtt 23940 ctggtcttgg gattgctgcccatcctgagg gtgaagagaa gcccctggac agaggattgg 24000 aaccttgttc ccagcaggctcagacaccag tgaggtccag ccaaggtccc cataggtgca 24060 gagggcgagg ggcctgtttagtggctgtct acaggtacag ggctgggctg gtgtgaccca 24120 gccgctcctc gttcacacctgggacaggct ggggcagtgc tcgtctgtct gtccgtcagt 24180 gtctgaaagc cccttgatgctggcaccagg tggggtcggg ggtgggggac aaagcttggg 24240 ggctctgggt gtgcttggaggggtctcctg gggagggggg ctctcacttg aattggcccc 24300 tctgtccctg tggccagcatcagctcagcg gcccctggat ggatacagca gctgccctcc 24360 cctgaggcgg ggggttttgtttcctagcaa taggcgtgct ggagctctgg gtgctggcca 24420 ccaacttcag agactatgccatcatcttca ctcagctgga gttcggggac gagcccttca 24480 acaccgtgga gctgtacagtaagtgtgctg tgcggggccc acgtagggag ctgtgtggtg 24540 cgtgggccgc tcggggcccacgttgggagc tgtgtggtgc gtgggccgct tgggggccgc 24600 gtggggagct gtgtggtgagtgggccgctc gggggccatg tggggagctg tgtggtgagt 24660 gggccgcttg gggccacatggggagctgtg tggtgagtgg gccgctcggg ggccatgtgg 24720 ggagctgtgt ggtgggtgggccgcacagga ccacatgcag gccaagttcc cccaccttga 24780 ggctcgctgc actggaggtggctttggaag gaccggggcc agcatcccag ggcaggtggc 24840 ctccttcctg acccttcccagcccctcccc ggcccctccc cagctctcag aggttgcttc 24900 cccctgcact gccctggtgcccccaggtct gacggagaca gccagccagg aggccatggg 24960 gctcttcacc aagtggagcaggagcctggg cttcctgtca cagtagcagg cccagctgca 25020 gaaggaccgt gagtgtccaccggagcagcc tcgggggagg cttggggcaa gttctggagc 25080 ccacagggcg tgatggggtgacagagaccc tggggttgat tctggcttct gcctgactca 25140 gtcgccctga 
aggagtcaggcaggctgagg gggtcttcac ctgcccctgc ccaaagtcag 25200 gcctgggggt cttgtccccacggcaggcga gtcttgtggg tggggccctg gggtgctggg 25260 gggctcagtc ctttttcacagccagtcttg cctccccttc tagtcacctg tgctcacaag 25320 atccttctgg taagccgccatcctgagcct caccctgggc tctcttgggg gaagggggtt 25380 ggggaggcca ccctacgcacagtgggtggg aggaccctct gcccaagccc acggggttca 25440 tggctccact gtctctaaggcagctgaggg tgtggaggag cctccatggt gggctgggcc 25500 ccctcacact cctcttggttttcagtgagt gctgcgtccc cagtagggat ggcgcccaca 25560 gggtcctgtg acctcggccagtgtccaccc acctcgctca gcggctcccg gggcccagca 25620 ccagctcaga ataaagcgattccacagcaa accaaggatg cttttgactg ggggccagcc 25680 ggggaattgc ggggaggatggcgggggtcg tcaccaaggg ccaagccaca gaaccataga 25740 gccagctgca aagaagacgcgtgcccaacc cagtcccttg ccagagcccc tctgggtctt 25800 cagacaccca aactcaggcccgggctcaga ggcgggggct ggtcagagga cacagcctgg 25860 gaggaagcca gcagtgagctccaccaggac cgcagctggg gtgccagagg ctccaaggac 25920 tgcccaggac aagagagaggtggctggtgg ggctcagagc agcccctgac tccccaggct 25980 caggcttctc tgggctggggttgtgtctct tgcctgggca ggggcaggga ccccagacac 26040 ctgagcctgc aggtcctcctgtgttctgat gggtgtgggg ggctgcgctg cctgccgtcc 26100 tgggagaagc acaggccatgggggtctcgg ggctggatgg ggggctgggg actgtgtgag 26160 gctcatgctg gagtgaggcgaggtgcacac agatgccttg acctgcaagg ctgggcatgc 26220 acacgcatgc acatgtgagcacacacccac acatgcacac acacacattc atgggcacgg 26280 gcacacagcc ccggcccgcccttcggcacc tgctcaggtg cttggtgcca ggccctcaag 26340 aaggtggtgg gggctgtgtgtgtggctgcc gcgcagagct caccatgtcc acatctgccc 26400 cagaagggcc acgaccagccctttgcctgt cacctgggag gtctaggatg agggccaggg 26460 cttaatgagt ctcaggggcacacacagtga ccccaggccg ggcactgagg gcaggagggt 26520 cctggggcct cagaggggcctctgtctgcc agggcagccc aggccagcag tgaccaccag 26580 gtggggcagt gccggcttccctggagtgtc tgggggtggt ccctgggcgg ggagagggta 26640 gcagagccca aggccacacctgttgcctcc ctccgctccc tcccctccct cctctcccga 26700 cacaggctgg cggctccagagaggtttaaa cactggcctt ggcggctgag ggaggagggt 26760 gaagatgagg caggggctgctggtgctggc gctggtgctg gtgctggtgc tagtgctggc 26820 tgcagggtcc 
caggtgcaggagtggtaccc cagggagtcc cacgccctca actggaacaa 26880 ggtgagccat gtgccactggtccctgagcc gggcgccggc tctggagctg gagcaggctc 26940 ctgcacagtc tcccggcagccgcccaggag cagggctacc agcagttcca ggcacgggcg 27000 ctgcccacct ggcctctcccactgcagcct ccagcacagg ggccccgcaa gttgcagggc 27060 aaaggcaggg agcctcgagggggttcgggg gcccaggggg gtcagggagg gccagggtgg 27120 aggggcccag agagctgtggccgaggaggt cgggagacgc ccagagtgca gagtgggtgt 27180 ggagcaagcg gggcatattctgtagcggga gggcccgggc acctgctgag tgagccgtgg 27240 gagcagcact cattggcacggcggtgccca gggcgctagg gttggcagca ggcagagttg 27300 ggggccgggg gccggcaggaagggaagtgg ccgagcagta cccagccccc aggaggccat 27360 gtgggaagga gccggggcctctgaggtagg gggcctccca cctgctggcg cccacagccc 27420 agccctttgc tgcctggggtccctgaccag ccgggcacct gcttctctgc agccctgctc 27480 cagtggccag ggcttgagtggccgggaccg aggcggcacc tgggcggagc tgggactccg 27540 gctcacagaa gggtcatttgtctcctgggt tctctagctg ggaacctcgg gggtggggac 27600 agcccccggt ggcagctcctgtccccgccc acacatccgc tgcgcagttt tcagggttct 27660 ggtacattct ggccactgccactgatgccc agggattctt gccggccagg gacaagagga 27720 agctgggggc gtccgtggtaaaggtgaaca aagtgggcca gctccgcgtg ctcctcgcct 27780 tcagacggtg agtgggcagcgccccccagg caagggtctc cccttccagg aggagggagc 27840 tgcctcgggg ctggaaaggagtctgggagg agcctgtctg tctggaagtt ccacaagaac 27900 cgtcccccca cttcctccccctaaactcag aggaagccac ctgcagcccc tgggacccca 27960 catgggtggg tgttgggaggaagctgcctg cagaccttgg gaccccacat gggtaggtgt 28020 tgggaggaag ccgcttgcagctcctgggcc ccacacgggt gggtgttggg agtttctctc 28080 gtgccgtggg acaggccccagtccgaattg gaaaatcctc aaaccacaga gcctcagatt 28140 gacacgagtt ggatagtgaagcaaaaaact ggggacctgg ggccccaacc ttcccgggtc 28200 ccacgggagc aggggcgggaccgagacgtg tgggtggggc tcggtggctg gcaggcggga 28260 ggaggggttc cctggtctgcagcctgagcc cccgccttgg ccccaggttg aaggggtgcc 28320 agtcccagga ggtgatcctgaggaaagacg ggaagaagcc ggtgtttggg aacgcctgtg 28380 catacgcggc gggcccgagggaaggacagg agggaggtgt gcttgggctc tgggccctgg 28440 aggagctgcg tgatatgcctctgacccccc tgtgccttga gcaggcccct tgtccccgag 28500 agcacgtggc 
aggcaggcgtcaggctggct gtgttcccag agcccacacg tcatgtgctg 28560 ggcggcctgg cccgggaggggaggccccgg gagggcactc agtccctctg gacacctgca 28620 cctgggtggg ggctctgaggcctcgggcac caggggtcag ggctgtgggc gggcacagcc 28680 acgccatccg ggaaccagcgggcatctctg ggcatctctt tcagtgaaag gggtgaaggc 28740 cttccacgtg ctgtccactgactacagcta cggcttggtc tacctccgcc tggggcgtgc 28800 aacccaaaac tacaagaacctgctgctctt ccgtgagcag gcacagcgga tggcagagga 28860 tggtggaggg ggtgcagagggaggcataga ggggcgctgg gcacaggggg ccgggcagga 28920 ggtgaggcgg gcagggagagtgggggttgg ggcggtggtc agggcagggg caatggaagc 28980 cgagaccaca gcagtcaggtcctggaggtc gctgaagcaa ggatagggga cgtcaggaag 29040 aggctgcaga gaaaggtgggccatccaggg ggagcggagg ccaccaggat ggtggcgggt 29100 gcaggtgcaa cgttctgggtgagggacaca gtggggatga ggaggcgtgg gaggggtccc 29160 aggtaagggt ctgcccgggatggaggggtg tggaagtgac ccagtgtggg ttaagggcca 29220 caaacccaat tctagacaccaaccacttgg ggtacacgga aaggagtggc cctcccggtt 29280 ggaacgggta gaggggaggtggggtcctgg ggtctcagtc gaggcagggc acgctcaggg 29340 tgcagcaagc tggggcacagctgaggggct ggctggaggg gactctgcag gcagagaggg 29400 ttccccaggt gacagcgtggtgaccctggc ttcccgagcc tggctgcctc gtaagtgggg 29460 aggggcccag cggccgctgtcctcccctgg ccaacccctg cttcacgcct aagctctgag 29520 gacatttcag cagctcctggggtagggatg ggggtcaggg tgcagaccct aaaatggcca 29580 gagctggagt ctctgatgtgatcctgatct caacctcaga taggcagaat gtttcgagct 29640 tccagagtct gaaggaattcatggacgctt gtgacattct ggggctctcc aaggccgccg 29700 tcatcctccc gaaagacggtaagctgtgcc ctcagccctt tgccctcctg gccctacctg 29760 cccgtcccat tggggagctgattcattggg ggaaggagag atgaaaggat ccatttgaaa 29820 ggtgccattt gaaagatccgcccaaacgtg cctcccacct cctgccgccc ctgagaatcg 29880 gggcttggcc tggcaggcctggcaccgaca gcgagaggcg tccggagcag atgctgcatc 29940 cgaccaggcg atgccgtggctgtggctgca ggatcggggg ctgcatggct gtgttctcgt 30000 ggtgcccaag acgagcacgcatggaaccag ggagcggccg ggggagacgt gttcacggct 30060 catgtgtctc ctccaagcagccagcccaca tgtggctgtt ttcctctccc acagcgtccc 30120 gtacacacac catcctgccatgagagccag cgtcctctgc tcttcccttt cgacgcgtgg 30180 tcctcccaga 
cccgaggagcgttgttccag atgtcaagtg gagaccacgg cttgttgaga 30240 ggtgtttggt gactctaggtcattgacagt gtgaaaatca agtggctcaa gggatccttt 30300 atgatcacag aaagatgggggaaggaggat acaggctgac cgggtggcga gatgtcagcc 30360 aggccccttc cccactgtcttctggagcac actgcaggct gctgctattc cctgtctgct 30420 gtgaaacaga ccacaggccggcccaacgca gtcctcgccg tgtgccttgc ctctgccctc 30480 cacgccgccg ccaccctccgcgttccttta catcctgcac tttctcagga agccgttcgc 30540 atatccaccc aacatggaggagcccccatg acgtgccagg cagcgtgctg gcccctgcgg 30600 ctgcgttagg gaacagaatcggcaaacgca gtcccagagt tgatgcctga tcaggagaag 30660 ccaacttcag cagtgtggtcacctggatac tcgagtatag accctagatc acctggatac 30720 tcgagtatag accctagaagtgtatctagg cgcggccagg gccccacaac agcaggtctg 30780 actcattcca tccagggcagcatcagggag agcttcctgg gggaagtggc atttgggctg 30840 acctgggaag ggtggtgggtgttaa 30865 3 531 DNA Homo sapiens CDS (1)..(531) 3 atg atg tca ttc ctgctc ggc gca atc ctg acc ctg ctc tgg gcg ccc 48 Met Met Ser Phe Leu LeuGly Ala Ile Leu Thr Leu Leu Trp Ala Pro 1 5 10 15 acg gct cag gct gaggtt ctg ctg cag cct gac ttc aat gct gaa aag 96 Thr Ala Gln Ala Glu ValLeu Leu Gln Pro Asp Phe Asn Ala Glu Lys 20 25 30 att gga gga ttc tgg agggaa gtc ggt gtg gcc tcc gat caa agc ctg 144 Ile Gly Gly Phe Trp Arg GluVal Gly Val Ala Ser Asp Gln Ser Leu 35 40 45 gtg ctg acg gcc ccg aag cgggtg gag ggc ttg ttc ctc acc ttg agc 192 Val Leu Thr Ala Pro Lys Arg ValGlu Gly Leu Phe Leu Thr Leu Ser 50 55 60 ggg agt aac ctg acc gtg aag gttgca tat aac agc tca gga agc tgt 240 Gly Ser Asn Leu Thr Val Lys Val AlaTyr Asn Ser Ser Gly Ser Cys 65 70 75 80 gag ata gag aag atc gtg ggc tcagaa ata gac agt acg gga aaa ttc 288 Glu Ile Glu Lys Ile Val Gly Ser GluIle Asp Ser Thr Gly Lys Phe 85 90 95 gct ttt cct ggc cac aga gag atc cacgtg ctg gac acc gac tac gag 336 Ala Phe Pro Gly His Arg Glu Ile His ValLeu Asp Thr Asp Tyr Glu 100 105 110 ggc tac gcc atc ctg cgg gtg tcc ctgatg tgg cgg ggc agg aac ttt 384 Gly Tyr Ala Ile Leu Arg Val Ser Leu MetTrp Arg Gly Arg Asn Phe 115 120 125 cgc gtc ctc aag tac ttt act cgg 
agcctt gag gac aag gac cgg ctg 432 Arg Val Leu Lys Tyr Phe Thr Arg Ser LeuGlu Asp Lys Asp Arg Leu 130 135 140 ggg ttc tgg aag ttt cgg gag ctg acagca gac act ggt ctc tac ctg 480 Gly Phe Trp Lys Phe Arg Glu Leu Thr AlaAsp Thr Gly Leu Tyr Leu 145 150 155 160 gcg gcc cgg cct ggg cgg tgt gccgag ctc ctg aag gag gag ctg att 528 Ala Ala Arg Pro Gly Arg Cys Ala GluLeu Leu Lys Glu Glu Leu Ile 165 170 175 taa 531 4 176 PRT Homo sapiens 4Met Met Ser Phe Leu Leu Gly Ala Ile Leu Thr Leu Leu Trp Ala Pro 1 5 1015 Thr Ala Gln Ala Glu Val Leu Leu Gln Pro Asp Phe Asn Ala Glu Lys 20 2530 Ile Gly Gly Phe Trp Arg Glu Val Gly Val Ala Ser Asp Gln Ser Leu 35 4045 Val Leu Thr Ala Pro Lys Arg Val Glu Gly Leu Phe Leu Thr Leu Ser 50 5560 Gly Ser Asn Leu Thr Val Lys Val Ala Tyr Asn Ser Ser Gly Ser Cys 65 7075 80 Glu Ile Glu Lys Ile Val Gly Ser Glu Ile Asp Ser Thr Gly Lys Phe 8590 95 Ala Phe Pro Gly His Arg Glu Ile His Val Leu Asp Thr Asp Tyr Glu100 105 110 Gly Tyr Ala Ile Leu Arg Val Ser Leu Met Trp Arg Gly Arg AsnPhe 115 120 125 Arg Val Leu Lys Tyr Phe Thr Arg Ser Leu Glu Asp Lys AspArg Leu 130 135 140 Gly Phe Trp Lys Phe Arg Glu Leu Thr Ala Asp Thr GlyLeu Tyr Leu 145 150 155 160 Ala Ala Arg Pro Gly Arg Cys Ala Glu Leu LeuLys Glu Glu Leu Ile 165 170 175 5 5159 DNA Homo sapiens 5′UTR(1)..(5159) 5 tgtttatttt tattattttt tttttttaaa tttatttaac ataatttttattaataataa 60 taataaatat agtttgtaaa ttttaaaaat aaaagagaag agataaagaatagattaaat 120 gaattatata aatgaattaa ggtaatactc aatcctttgg atatgtgggattatacacca 180 tataattttg ttatatttaa tattagtata tatggtaggg tgtatgtgagaaaaggtaaa 240 acttaaaaaa gcggtgaatt acaagatcgg taacgaaggt cgaaaagaaaggaaatagcg 300 ggtgtctggt gatgtcttgg atttgagggt cgggggtttc gaactctttttaaggtccat 360 tatttaggcg cggcggggag agaggtatgt aaaaccgcaa gatggggtccttatcaggat 420 aaggaccacg agtgactgtt agcagaaagt aatatcatga ggaaacagtaaaaaaaatac 480 ttatgagttg ggttgggagg ggactgagct ctgagaaagg accccagttcactatgtcaa 540 aaacgaaaaa aaggcaaggc gcagtggctc acgcctgtaa tcccggcactttgggaggcc 600 aaggagggca gatcacctga 
ggtcaggagt ttgagaccag cctgaccaacatggtgaaat 660 cctgtctcta ctaaaaatgc aaaaaaaatt agacgggcat ggtggctactgttatcccag 720 ctacatggga ggctgaggca ggagaatcgc ttgaacccgg gaggtggagattgcagtgag 780 ctgagattgc accactgcac tccagcctga gtgacagagt gagaccccgtcttaaaaaac 840 aaaacaaggc caggcacggt ggctcatgcc tgcaatccca gcactctgggaggctgaggc 900 aggtggatca caaggtcagg agatcgagac catcctggct aacatggtgaaaccctgtct 960 ctactaaaaa tacaaagaat tatcccagca tggtggtgga cgcctgtggtcccagctact 1020 ccggaggctg aggtgggaga atggcgtgaa cccaggaggt ggagcttgcagtgagctgag 1080 attgcgccac agcactccag cctgggcgac agagcgagac tccatctcaaaaaaagaaaa 1140 aaaaaaaaaa aaaaaaaaga aaaaaaagaa aaaagaaaaa aagaaaagaaaacgaagaaa 1200 aaccacaaat aaataaaagt tgaggtatga tgtatgcaca gctgcattcgcccctttgag 1260 ggcacagttg tgtgagttta gacaaatgca aatgttcatg gaaccactgcaagtgcgaca 1320 cagcagctct ggtgcccaca aagcttcctc gaggtccctt gcggtcaatcactgcccgca 1380 cctcagccct cagagccact gggaagcccc ctgccctgtg ggctcacctgtcctagaacg 1440 tcgtataaaa ggagtcgcag atcgcagcct tttccacctg gcttctgtccctctgcaggc 1500 tgcacctggg attcagccac cttcgttcct tttactgctg agtgtccatcacgtgctgct 1560 catgcagaag tgctcaagtt cgctgtaaac gtagttttca tttctcccgggtggagacct 1620 ccataagtag gtaggttggt ggaatgtttc actgcatgag agtctgccaggaagttttgc 1680 agtgtggtcc cgtcttgttc ttcgcgcaga gggtgatagc tctgctgactcccagccctc 1740 acaggctcta gggattgagc ctgtttctgt ggaacgcctt ccatggtctgcgaggtgcac 1800 agaggggtct caaaaagctt ccaggttgca tcttcccatg gactgatgctgagcgtgttc 1860 tcatctccca tccggatgtt ttctttggtg gagtatcttt tcagatctcttgccttttta 1920 tttttatttg tttatatata tattttgaga cggaattttg ctctttcacttaggttggag 1980 tgcaacggcg cgatcttggc tcactgcaac ctccccccac ccacctcactttgggttcaa 2040 gcgattctcc tgcctcagcc tcccgagtag ctgggactac aggcgcccgccaccatgccc 2100 ggctaatttt tgtattttga gaagagatag ggtttcacca tgttggccaggctggtctcg 2160 aactcctgac ctcaggtgat ccacccgcct cggcctccca aaatgcgaggatcacaggcg 2220 tgagatgcca cacctggcct ctttttattt ttaatggagt ggcttgttttcttattactc 2280 agtcttggga gttcctcatg gatgctggat ttcagtctcg tctcaggtgtatgcgttttg 
2340 tagatgtttt ctgtagggac cagccccaca gggttggtga gtttctccctgtgtgctgag 2400 atgagagggc atagaaataa ggacacaaga aaaagacata aaagaaaagacagctgggcc 2460 tggggaccac taccaccaag acgtggagac cggtagtggc cccgaatgtctggctgtgct 2520 gatatttatt ggatacaaag caaaagggac agggtaaaga gtgtgagtcatctccaatga 2580 taggtaaggt gacgtgagtc acgtgtccac cggatggggg gcccttccctgtttggcagc 2640 caaggcggag agagagagag agagacagct tacatcatta tttctgcatatcagagactt 2700 ttagtacttt cactaattga ctactgctat ctagaaggca gagccaggtgtacaggatgg 2760 aacatgaagg cggactacga gcgtgaccac tgaagcacag catcacagggagacggttag 2820 gcctctggat aactgcgggc aggtctgact gatgtcaggc cctccacaggaggtggagga 2880 gtagagtcct ctctaagctc ccccggggga aagggagact ccctttccctgtctgctaag 2940 tagcaggtgt tttcccttga cactgacgct actgctagat cacggtccgcttggcaaccg 3000 gtgtcttccc agacgctggc gtcaccacta gaccaaggag ccctctggtggccctgtccg 3060 gcgtaacaga aggctcgcac tcttgtcttc tggtcacttc tcactatgtcccctcagctc 3120 ctatctctgt atggcctggt ttttcctagg ttataattgt agagcaaggattattataat 3180 attggaataa agagtaattg ctacaagcta atgattaata atattcatatataatcatgt 3240 ctatgatcta gatctagtat aactcttgtt attttatata ttttattaaactggaacagc 3300 tcgtgccctc ggtctcttgc ttcggcatct gggtggcttg ctgaccacagttttcctgct 3360 gtatctaact gctggcttag gtgaacctgt caggggcgtg gctgggaactggaacctcct 3420 gaagaccagt ggaggccaca agcagaagcc ctcggctcct tcccttccccgatgctgggg 3480 gccaagggtc tttctctgac cctgtccccc cctcacgtta ccagtggagggtgtccaggt 3540 tcttggcatt ttgaacaaag aattggacaa aacacacaaa caaagcaaggaaagaatgaa 3600 gcaacaaaag cagaaattta ttgaaaatga aagcactctt tacagggtgggagcgggctg 3660 agcaagcagc tcaagggccc cggttacaga attttctggt gtttcaatatcctctagcgg 3720 tttaccattg gttacttggt gtatgcccta tgtaaatgaa gaggatgaagtcaaagtcat 3780 tttctcggct gagcatactg tcacgcctgt aatcccagta ctttgggaggctgaggtggg 3840 tgtatcacct gaggtcggga gttcaagacc agcctgatga acatggagaaaccccgtctc 3900 tactaaaaat acaaaattag ctgggtgtgg tggcgggtgc ctgtaatcccagatagtcag 3960 gaggctgagg caggaaaatc gcttaaaccc gggaggcaga ggctgcagagagccaaggag 4020 tgcaccactg cactccagcc tgggcaacaa 
gagcaaaact ccatctcaaaaaacaaacaa 4080 acaaaacgaa gtcatgtact cggtatgagc cctacgtaaa cggagaggatggagtgaagg 4140 tacaaagcca ttcacattcc cgtcatcgtt agtgtttcca gttgatttggttctaggatg 4200 cccttgggct ccctgcgtcc aggccctctt ctcctgcctc actctcactccagaggactg 4260 gcagtgctgg cctccctggc aggactcccg cccctgacca gaggggtctgggcgtgctcc 4320 gaggccacct gctgctcctt gcccagtgtc cctgggtgct ctgtggggacacagagaggt 4380 cacttgggag gttcctgcct ctgccagact gcggcgcctt cctggcgtcattgccccctg 4440 cacgcttata caaccagtga taaggcggta tccaatgccg taatcctccccagggccagc 4500 ccagccctgt ccggtcccgt ccggtccctg gcctgcttgc tgggagcccagctgggggct 4560 gcatcggggg tggtgggggt ctggagtgag gggtgcccag gcctctctcacagtgtgggg 4620 gcacaagctc tgccctggac actctgccag tgtgacccta ccccccatgttgggggtgag 4680 agaagggggc tttctgggtg gccagatccc aggagagggg gcctagccagaccaacccca 4740 gcccagctct ctgcagctgt accccaaagg ccctggccca gcccagccagacagagtgtg 4800 gacagagggg cttgggtgcg agtccaggca aggcaggaca agtctctgctgtggcttggc 4860 agtggccccg tgactggtga catagtctgg caggcctagg ggacaaatcctcagcccggg 4920 tgcatcaggc cagagcctgg ctgtctggag ccctccagga aatgaccccccgggctgggg 4980 gacatccgag tgattgtgcc aaacaatgga caagggggcc ggagtctcagagccgaagtg 5040 ccttccgctt ccatctgctc cgcccaggag caggtgcacc caggtggtggcctgggctat 5100 aaagctggcc ccctggggct tggggactca gcaccagggg ctggagggcaggggagggg 5159 6 175 PRT Mus musculus mat_peptide (1)..() 6 Met Glu AlaArg Leu Leu Ser Asn Val Cys Gly Phe Phe Leu Val Phe 1 5 10 15 Leu LeuGln Ala Glu Ser Thr Arg Val Glu Leu Val Pro Glu Lys Ile 20 25 30 Ala GlyPhe Trp Lys Glu Val Ala Val Ala Ser Asp Gln Lys Leu Val 35 40 45 Leu LysAla Gln Arg Arg Val Glu Gly Leu Phe Leu Thr Phe Ser Gly 50 55 60 Gly AsnVal Thr Val Lys Ala Val Tyr Asn Ser Ser Gly Ser Cys Val 65 70 75 80 ThrGlu Ser Ser Leu Gly Ser Glu Arg Asp Thr Val Gly Glu Phe Ala 85 90 95 PhePro Gly Asn Arg Glu Ile His Val Leu Asp Thr Asp Tyr Glu Arg 100 105 110Tyr Thr Ile Leu Lys Leu Thr Leu Leu Trp Gln Gly Arg Asn Phe His 115 120125 Val Leu Lys Tyr Phe Thr Arg Ser Leu Glu Asn Glu Asp Glu Pro Gly 130135 
140 Phe Trp Leu Phe Arg Glu Met Thr Ala Asp Gln Gly Leu Tyr Met Leu145 150 155 160 Ala Arg His Gly Arg Cys Ala Glu Leu Leu Lys Glu Gly LeuVal 165 170 175 7 27 DNA Mus musculus primer_bind (1)..(27) 7 cttctctggtacaagctcca ccctggt 27 8 24 DNA Mus musculus primer_bind (1)..(24) 8tggatggata gatgcataca tgag 24 9 20 DNA Mus musculus primer (1)..(20) 9caacggtggt atatccagtg 20 10 21 DNA Mus musculus primer (1)..(21) 10gatgtgctcc aggctaaagt t 21 11 21 DNA Mus musculus primer (1)..(21) 11agaaacggaa tgttgtggag t 21
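
The correspondence between the coding sequence of SEQ ID NO:3 and the polypeptide of SEQ ID NO:4 listed above can be checked with a short script. The following is a minimal Python sketch, assuming the standard genetic code. The codon rows are transcribed from the listing with the position numbers removed, the one-letter string is a re-encoding of the three-letter residues of SEQ ID NO:4, and the helper name `translate` is hypothetical rather than part of the disclosure.

```python
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:3 as transcribed from the listing, one 16-codon row per line.
SEQ_ID_NO_3_ROWS = (
    "atg atg tca ttc ctg ctc ggc gca atc ctg acc ctg ctc tgg gcg ccc",
    "acg gct cag gct gag gtt ctg ctg cag cct gac ttc aat gct gaa aag",
    "att gga gga ttc tgg agg gaa gtc ggt gtg gcc tcc gat caa agc ctg",
    "gtg ctg acg gcc ccg aag cgg gtg gag ggc ttg ttc ctc acc ttg agc",
    "ggg agt aac ctg acc gtg aag gtt gca tat aac agc tca gga agc tgt",
    "gag ata gag aag atc gtg ggc tca gaa ata gac agt acg gga aaa ttc",
    "gct ttt cct ggc cac aga gag atc cac gtg ctg gac acc gac tac gag",
    "ggc tac gcc atc ctg cgg gtg tcc ctg atg tgg cgg ggc agg aac ttt",
    "cgc gtc ctc aag tac ttt act cgg agc ctt gag gac aag gac cgg ctg",
    "ggg ttc tgg aag ttt cgg gag ctg aca gca gac act ggt ctc tac ctg",
    "gcg gcc cgg cct ggg cgg tgt gcc gag ctc ctg aag gag gag ctg att",
    "taa",
)

# SEQ ID NO:4 re-encoded in one-letter code (16 residues per row).
SEQ_ID_NO_4 = "".join((
    "MMSFLLGAILTLLWAP", "TAQAEVLLQPDFNAEK", "IGGFWREVGVASDQSL",
    "VLTAPKRVEGLFLTLS", "GSNLTVKVAYNSSGSC", "EIEKIVGSEIDSTGKF",
    "AFPGHREIHVLDTDYE", "GYAILRVSLMWRGRNF", "RVLKYFTRSLEDKDRL",
    "GFWKFRELTADTGLYL", "AARPGRCAELLKEELI",
))


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3].lower()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


cds = "".join(SEQ_ID_NO_3_ROWS).replace(" ", "")
print(len(cds), "nucleotides,", len(translate(cds)), "residues")
print("matches SEQ ID NO:4:", translate(cds) == SEQ_ID_NO_4)
```

The rows are kept space-separated so that each codon can be compared by eye against the listing before the spaces are stripped.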

What is claimed is:
 1. An isolated promoter region, comprising an about 5.3 kb fragment (Accession No. AF082221) of mouse genomic clone 10983 (Genomesystem Inc., St. Louis, Ky.) between the EcoRV and SalI restriction sites, or functional portion thereof.
 2. An isolated promoter region of claim 1, the functional portion comprising a TATA box and at least one cis-acting regulatory sequence selected from the group consisting of a Sp1-binding site, an AP-1 binding site, a retinoic acid receptor binding site, an androgen receptor binding site, a C-Ets binding site, a SRY binding site, an AP-4 binding site, a C/EBP binding site, and combinations thereof.
 3. The isolated promoter region of claim 1, comprising: (a) the nucleotide sequence of SEQ ID NO:1; or (b) a nucleic acid molecule substantially identical to SEQ ID NO:1.
 4. The isolated promoter region of claim 1, comprising a 20 base pair nucleotide sequence identical to a contiguous 20 base pair nucleotide portion of SEQ ID NO:1.
 5. An isolated promoter region, or functional portion thereof, comprising an about 5150 base pair region immediately upstream of the human hEP17 transcription start site.
 6. The isolated promoter region of claim 5, the functional portion comprising a TATA box and at least one cis-acting regulatory sequence selected from the group consisting of a Sp-1 binding site, an AP-1 binding site, a cAMP response element binding protein (CREB) binding site, a SRY-related HMG-box gene 5 (Sox5) binding site, a Sex-determining region Y gene product (SRY) binding site, a c-Ets binding site, a GATA binding site, an Octamer transcription factor 1 (Oct-1) binding site, and combinations thereof.
 7. The isolated promoter region of claim 5, comprising: (a) the nucleotide sequence of SEQ ID NO:5; or (b) a nucleic acid molecule substantially identical to SEQ ID NO:5.
 8. The isolated promoter region of claim 5, comprising a 20 base pair nucleotide sequence identical to a contiguous 20 base pair nucleotide portion of SEQ ID NO:5.
 9. A chimeric gene comprising the isolated promoter region of claim 1 or 5 operably linked to a heterologous nucleotide sequence.
 10. A vector comprising the chimeric gene of claim 9.
 11. A host cell comprising the chimeric gene of claim 9.
 12. The host cell of claim 11, wherein the cell is selected from the group consisting of a bacterial cell, a hamster cell, a mouse cell, and a human cell.
 13. A mouse comprising a mouse cell of claim 12.
 14. The mouse of claim 13, wherein expression of the chimeric gene confers male infertility in an otherwise fertile mouse.
 15. A method for identifying a substance that regulates EP17 expression, the method comprising: (a) establishing a gene expression system comprising the chimeric gene of claim 12, wherein the heterologous nucleotide sequence is a reporter gene, and components required for gene transcription and translation, whereby the reporter gene is expressed, and a level of reporter gene expression is assayable; (b) assaying a baseline level of reporter gene expression using the gene expression system of (a) in the absence of a candidate substance; (c) exposing the gene expression system of (a) to a plurality of candidate substances; (d) assaying a level of reporter gene expression using the gene expression system of (a) in the presence of a candidate substance of (c); and (e) selecting a candidate substance whose presence results in an altered level of reporter gene expression when compared to the baseline level.
 16. The method of claim 15, wherein the substance is a protein, a chemical compound, or a peptide.
 17. A method for identifying a substance that regulates EP17 expression, the method comprising: (a) creating a transgenic mouse bearing a chimeric transgene comprising the promoter region of claim 1 or 5 operably linked to a reporter gene, wherein the reporter gene is expressed, and wherein a level of reporter gene expression is assayable; (b) assaying a baseline level of reporter gene expression in the transgenic mouse in the absence of a candidate substance; (c) administering the candidate substance to the transgenic mouse; (d) assaying a level of reporter gene expression in the transgenic mouse following administration of the candidate substance to the mouse; and (e) selecting a substance wherein the level of reporter gene expression in the transgenic mouse is altered following administration of the substance compared to the baseline level.
 18. A method for producing an epididymal cell line, the method comprising: (a) creating a transgenic animal bearing a chimeric transgene gene comprising the promoter region of claim 1 operably linked to a selectable marker gene that permits cell growth in the presence of a selective agent; (b) procuring epididymal cells from the transgenic animal of (a); and (c) reproducing the cells in vitro in the presence of the selective agent.
 19. A method for mutagenizing an EP17 locus in a vertebrate animal, the method comprising: (a) constructing a targeting vector having the isolated promoter region of claim 1, a marker gene, and an isolated 3′ flanking region of an EP17 gene, wherein the marker gene is positioned between the promoter region and the 3′ flanking region; (b) linearizing the targeting vector by digestion with a restriction endonuclease, wherein the promoter region, marker gene, and 3′ flanking region are undigested; (c) introducing the linearized vector into embryonic stem cells; (d) detecting the marker gene in the embryonic stem cells; (e) selecting embryonic stem cells having the marker gene; and (f) generating a transgenic vertebrate animal derived from the selected embryonic stem cells, wherein the EP17 locus of the animal is altered as a result of a homologous recombination event mediated by the targeting vector.
 20. An isolated EP17 polypeptide, or functional portion thereof, comprising: (a) a polypeptide encoded by the nucleotide sequence of SEQ ID NO:3; (b) a polypeptide encoded by a nucleic acid molecule that is substantially identical to SEQ ID NO:3; (c) a polypeptide having the amino acid sequence of SEQ ID NO:4; (d) a polypeptide that is a biological equivalent of the polypeptide of SEQ ID NO:4; or (e) a polypeptide which is immunologically cross-reactive with an antibody that shows specific binding with a polypeptide of SEQ ID NO:4.
 21. An isolated nucleic acid molecule encoding a human EP17 polypeptide.
 22. The isolated nucleic acid molecule of claim 21, comprising: (a) the nucleotide sequence of SEQ ID NO:3; or (b) a nucleic acid molecule substantially identical to SEQ ID NO:3.
 23. The isolated nucleic acid molecule of claim 21, comprising a 20 nucleotide sequence that is identical to a contiguous 20 nucleotide sequence of SEQ ID NO:3.
 24. A chimeric gene, comprising the nucleic acid molecule of claim 21 operably linked to a heterologous promoter.
 25. A vector comprising the chimeric gene of claim 24.
 26. A host cell comprising the chimeric gene of claim 24.
 27. The host cell of claim 26, wherein the cell is selected from the group consisting of a bacterial cell, a hamster cell, a mouse cell, and a human cell.
 28. A method of detecting a nucleic acid molecule that encodes an EP17 polypeptide, the method comprising: (a) procuring a biological sample having nucleic acid material; (b) hybridizing the nucleic acid molecule of SEQ ID NO:1, 2, 3, or 5 under stringent hybridization conditions to the biological sample of (a), thereby forming a duplex structure between the nucleic acid of SEQ ID NO:1, 2, 3, or 5 and a nucleic acid within the biological sample; and (c) detecting the duplex structure of (b), whereby an EP17 nucleic acid molecule is detected.
 29. An antibody that specifically recognizes an EP17 polypeptide encoded by the nucleotide sequence of SEQ ID NO:3; a polypeptide encoded by a nucleic acid molecule that is substantially identical to SEQ ID NO:3; a polypeptide having the amino acid sequence of SEQ ID NO:4; a polypeptide that is a biological equivalent of the polypeptide of SEQ ID NO:4; or a polypeptide which is immunologically cross-reactive with an antibody that shows specific binding with a polypeptide of SEQ ID NO:4.
 30. A method for producing an antibody that specifically recognizes an EP17 polypeptide, the method comprising: (a) recombinantly or synthetically producing an EP17 polypeptide of claim 20, or portion thereof; (b) formulating the polypeptide of (a) whereby it is an effective immunogen; (c) administering to an animal the formulation of (b) to generate an immune response in the animal comprising production of antibodies, wherein antibodies are present in the blood serum of the animal; and (d) collecting the blood serum from the animal of (c) comprising antibodies that specifically recognize an EP17 polypeptide.
 31. A method for detecting a level of EP17 polypeptide, the method comprising: (a) obtaining a biological sample having peptidic material; and (b) detecting an EP17 polypeptide in the biological sample of (a) by immunochemical reaction with the antibody of claim 29, whereby an amount of EP17 polypeptide in a sample is determined.
 32. A method for identifying a substance that modulates EP17 function, the method comprising: (a) isolating an EP17 polypeptide encoded by the nucleotide sequence of SEQ ID NO:3; a polypeptide encoded by a nucleic acid molecule that is substantially identical to SEQ ID NO:3; a polypeptide having the amino acid sequence of SEQ ID NO:4; a polypeptide that is a biological equivalent of the polypeptide of SEQ ID NO:4; or a polypeptide which is immunologically cross-reactive with an antibody that shows specific binding with a polypeptide of SEQ ID NO:4; (b) exposing the isolated EP17 polypeptide to a plurality of substances; (c) assaying binding of a substance to the isolated EP17 polypeptide; and (d) selecting a substance that demonstrates specific binding to the isolated EP17 polypeptide.
 33. A method for modulating EP17 function in a subject, the method comprising: (a) preparing a pharmaceutical composition, comprising a substance identified according to the method of claim 15, 17, 30, or 32, and a carrier; (b) administering an effective dose of the pharmaceutical composition to a subject, whereby EP17 activity is altered in the subject.
 34. A method for modulating EP17 function in a subject, the method comprising: (a) preparing a gene therapy vector having a nucleotide sequence encoding an EP17 polypeptide or a nucleotide sequence encoding a nucleic acid molecule, peptide, or protein that interacts with an EP17 nucleic acid or polypeptide; and (b) administering the gene therapy vector to a subject, whereby the function of EP17 in the subject is modulated.
 35. The method of claim 34 further comprising the EP17 promoter region of claim 1 or 5.
 36. A method for diminishing the fertile capacity of a subject, the method comprising: (a) identifying a chemical compound, peptide, or antibody that interacts with the polypeptide of SEQ ID NO:4 or 6; (b) preparing a pharmaceutical composition comprising the chemical compound, peptide, or antibody of (a) and a carrier; and (c) administering an effective dose of the pharmaceutical composition to a subject, whereby the fertile capacity of the subject is diminished.
 37. A method for promoting fertility in a subject, the method comprising: (a) identifying a chemical compound or peptide that interacts with the polypeptide of SEQ ID NO:4 or 6; (b) preparing a pharmaceutical composition comprising the chemical compound or peptide of (a) and a carrier; and (c) administering the pharmaceutical composition to a subject, whereby the fertility of the subject is improved.
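
Claims 4, 8, and 23 each recite a 20-nucleotide sequence identical to a contiguous 20-nucleotide portion of a listed sequence. As an illustration only, the following Python sketch shows one way such an exact-match comparison might be computed. The function name `shares_contiguous_window` is hypothetical, the reference used here is just the first 96 nucleotides of SEQ ID NO:3 as transcribed from the listing, and the check considers the given strand only (no reverse complement). It is a sketch under those assumptions and not a statement of how the claims are to be construed.

```python
def shares_contiguous_window(candidate, reference, window=20):
    """Return True if any `window`-long stretch of `candidate` occurs exactly in `reference`."""
    candidate = candidate.lower()
    reference = reference.lower()
    if len(candidate) < window or len(reference) < window:
        return False
    # Index every contiguous window of the reference once, then probe it.
    reference_windows = {
        reference[i:i + window] for i in range(len(reference) - window + 1)
    }
    return any(
        candidate[i:i + window] in reference_windows
        for i in range(len(candidate) - window + 1)
    )


# First 96 nucleotides of SEQ ID NO:3 (assumed transcription of the listing).
reference = (
    "atgatgtcattcctgctcggcgcaatcctgaccctgctctgggcgccc"
    "acggctcaggctgaggttctgctgcagcctgacttcaatgctgaaaag"
)

# A candidate that embeds positions 31-50 of the reference between extra bases.
candidate_hit = "gg" + reference[30:50] + "tt"

# The same 20-mer with a single substitution at its eleventh position.
window_20 = reference[30:50]
substitute = "a" if window_20[10] != "a" else "c"
candidate_miss = window_20[:10] + substitute + window_20[11:]

print(shares_contiguous_window(candidate_hit, reference))
print(shares_contiguous_window(candidate_miss, reference))
```

Storing the reference windows in a set keeps each lookup constant-time, which matters little at this size but would for the 30,865-nucleotide genomic sequence in the listing.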